AD  A0931  88 


ADMISSIBILITY  OF  ESTIMATORS  IN  THE  ONE 
PARAMETER  EXPONENTIAL  FAMILY  AND  IN  MULTIVARIATE 
LOCATION  PROBLEMS 


Submitted  to  the  faculty  of  the  Graduate  School 
in  partial  fulfillment  of  the  requirements 
for  the  degree  Doctor  of  Philosophy 
in  the  Department  of  Mathematics 
Indiana  University 

September,  1980 


Approved for public release;
distribution unlimited.


UNCLASSIFIED

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT DOCUMENTATION PAGE


READ INSTRUCTIONS BEFORE COMPLETING FORM

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)
ADMISSIBILITY OF ESTIMATORS IN THE ONE PARAMETER
EXPONENTIAL FAMILY AND IN MULTIVARIATE LOCATION
PROBLEMS

5. TYPE OF REPORT & PERIOD COVERED
Interim

6. PERFORMING ORG. REPORT NUMBER

7. AUTHOR(s)
Dan A. Ralescu

8. CONTRACT OR GRANT NUMBER(s)
AFOSR-76-2927

9. PERFORMING ORGANIZATION NAME AND ADDRESS
Indiana University
Department of Mathematics
Bloomington, Indiana 47405

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
61102F 2304/A5

11. CONTROLLING OFFICE NAME AND ADDRESS
Air Force Office of Scientific Research (AFSC)
Bolling Air Force Base, D.C. 20332

12. REPORT DATE
September 1980

13. NUMBER OF PAGES
70

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)
Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)
Approved for Public Release; Distribution Unlimited

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
Admissibility, minimaxity, exponential family, affine estimators,
truncated probability density, invariant estimators, truncated
parameter space.

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Let X be a random variable with probability density function
$f_\theta(x) = \beta(\theta)e^{\theta x}$ with respect to some $\sigma$-finite measure
$\mu$; $\theta \in \Theta = \{\theta : \beta^{-1}(\theta) = \int e^{\theta x}\, d\mu(x) < \infty\}$.
Sufficient conditions are obtained for the admissibility of nonlinear
estimators of the form $\delta(X) = (aX + b)/(cX + d)$ for the problem of
estimating an arbitrary piecewise continuous, locally integrable
function $g(\theta)$ with squared error loss. Several applications of the
main result are given and some new nonlinear admissible estimators
are discovered. The problem is also studied for the case when
the parameter space is truncated, that is, when
$\theta \in \Theta_0 = \{\theta : \theta \le \theta_0\} \subset \Theta$, $\theta_0$ being an interior point of
$\Theta$. The minimaxity of linear estimators of the form aX + b in
estimating an arbitrary differentiable function $g(\theta)$ is also
investigated.

In the multivariate case the classes of estimators which
improve upon the best invariant estimator are considered. Let
X be a p-dimensional random vector having a density of the
form $f(x - \theta)$, where $\theta \in \mathbb{R}^p$ is a location parameter. For
$p \ge 3$, different classes of estimators $\delta(X)$ which are
uniformly better than the best invariant estimator $\delta_0(X) = X$
are obtained when the loss function is of the type

$L(\theta, \delta(x)) = \sum_{i=1}^{p} c_i (\theta_i - \delta_i(x))^2$,

where $c_1, \ldots, c_p$ are given positive constants. It is shown that
$\delta$ can be taken of the form $\delta = \delta_1 * \xi$, where $\delta_1$ is an
estimator which improves upon $\delta_0$ outside of a compact set, $\xi$ is a
suitable probability density in $\mathbb{R}^p$, and $*$ denotes the convolution.
Some examples of densities $\xi$ (such as truncated densities) which
generate estimators which improve upon $\delta_0$ are given, and some
problems of further research interest are also stated.


UNCLASSIFIED


Accepted  by  the  faculty  of  the  Graduate  School,  Indiana  University, 
in  partial  fulfillment  of  the  requirements  for  the  degree  of 
Doctor  of  Philosophy. 


Doctoral  Committee 


Madan  L.  Puri,  Chairman 

Grahame  Bennett 


AIR FORCE OFFICE OF SCIENTIFIC RESEARCH (AFSC)
NOTICE OF TRANSMITTAL TO DDC
This technical report has been reviewed and is
approved for public release IAW AFR 190-12 (7b).
Distribution is unlimited.

A. D. BLOSE
Technical Information Officer


ACKNOWLEDGEMENTS 


It is a pleasure to express my gratitude to Professor
Madan Puri for his invaluable advice during the past years.
Without his constant encouragement and interest in my work
this thesis could not have been written.

I am indebted to Professor Daniel Maki for the effort
and encouragement he provided.

My special thanks are due to Professors Grahame Bennett
and Victor Goodman for serving on my thesis committee.

Finally, I would like to thank my wife Anca for her
understanding and moral support during all these years.

This work was supported in part by the Air Force Office
of Scientific Research, AFSC USAF, Grant No. AFOSR-76-2927.




Admissibility  of  estimators  in  the  one  parameter 
exponential  family  and  in  multivariate  location  problems 


by 

Dan  A.  Ralescu 




ABSTRACT 


We are concerned with problems related to admissibility and
minimaxity of estimators in the one parameter exponential family,
and with classes of estimators which improve upon the best
invariant estimator in multivariate location problems (dimensions
$p \ge 3$). In connection with admissibility problems we give sufficient
conditions for the admissibility of nonlinear estimators
(aX + b)/(cX + d) in estimating an arbitrary function $g(\theta)$ with
quadratic loss. More precisely, let X be a random variable with
probability density function $f_\theta(x) = \beta(\theta)e^{\theta x}$ with respect to
some $\sigma$-finite measure $\mu$;
$\theta \in \Theta = \{\theta : \beta^{-1}(\theta) = \int e^{\theta x}\, d\mu(x) < \infty\}$.
Sufficient conditions are obtained for the admissibility of
nonlinear estimators of the form $\delta(X) = (aX + b)/(cX + d)$ for the
problem of estimating an arbitrary piecewise continuous, locally
integrable function $g(\theta)$ with squared error loss.

Several applications of the main result are given and some
new nonlinear admissible estimators are discovered. In particular,
if $X_1, X_2, \ldots, X_n$ is a sample from the exponential density
$\lambda e^{-\lambda x} 1_{(0,\infty)}(x)$, $\lambda > 0$, we show that $(n - 2)/(X + k)$ is
admissible in estimating $\lambda$ for every $k \ge 0$, where
$X = \sum_{i=1}^{n} X_i$.


This problem is also studied for the case when the parameter
space is truncated, that is, when
$\theta \in \Theta_0 = \{\theta : \theta \le \theta_0\} \subset \Theta$, $\theta_0$ being an interior point of $\Theta$.

Problems related to minimaxity of linear estimators are
also investigated. If X has a density belonging to the
exponential family, we give sufficient conditions for the
minimaxity of estimators of the form aX + b in estimating
an arbitrary differentiable function $g(\theta)$.

In the multivariate case we concentrate on classes of
estimators which improve upon the best invariant estimator.
Let X be a p-dimensional random vector having a density of
the form $f(x - \theta)$, where $\theta \in \mathbb{R}^p$ is a location parameter.
For $p \ge 3$, different classes of estimators $\delta(X)$ which are
uniformly better than the best invariant estimator $\delta_0(X) = X$
are obtained when the loss function is of the type

$L(\theta, \delta(x)) = \sum_{i=1}^{p} c_i (\theta_i - \delta_i(x))^2$,

where $c_1, \ldots, c_p$ are given positive constants.

It is shown that $\delta$ can be taken of the form $\delta = \delta_1 * \xi$,
where $\delta_1$ is an estimator which improves upon $\delta_0$ outside of a
compact set, $\xi$ is a suitable probability density in $\mathbb{R}^p$, and $*$
denotes the convolution. We give some examples of densities $\xi$ (such
as truncated densities) which generate estimators which improve upon
$\delta_0$, and we also state some problems which could be of further
research interest.


CHAPTER 0


PRELIMINARIES 

In this chapter we describe the general decision problem,
the concepts of admissibility and minimaxity, some related
background and the problems investigated in Part I of the thesis.

Let X be a random variable with distribution function

(0.1)    $F_\theta(x) = \int_{-\infty}^{x} f_\theta(t)\, d\mu(t)$

depending on an unknown parameter $\theta \in \Theta \subseteq \mathbb{R}$. The set $\Theta$,
which is assumed known, is the parameter space. The quantity
$f_\theta(x)$ is the density of $F_\theta(x)$ with respect to a $\sigma$-finite
measure $\mu$. An important problem in statistics is to estimate
$\theta$ or a function $g(\theta)$ of $\theta$ on the basis of an observation
X (or series of observations) on $F_\theta$. This is done by
determining a rule which, for each set of values of observations,
specifies what decision should be taken. Mathematically, such
a rule is a function $\delta$, which to each possible value x of
X assigns a decision $d = \delta(x)$, that is, a function whose
domain is a set of values of X and whose range is a set of
possible decisions.

To see how $\delta$ should be chosen one has to compare the
consequences of using different rules. Suppose that the consequence
of taking a decision d in estimating $g(\theta)$ is a loss which can
be expressed as a non-negative real number $L(g(\theta), d)$. Then
the long-term average loss that would result from the use of
$\delta$ in a number of repetitions of the experiment is the
expectation $E_\theta[L(g(\theta), \delta(X))]$ evaluated under the assumption that
$f_\theta$ is the true density of X. This expectation, which depends
on the decision rule $\delta$ and the density $f_\theta$, is called the
risk function of $\delta$, and we shall denote it by $R(g(\theta), \delta)$.
Thus

(0.2)    $R(g(\theta), \delta) = E_\theta[L(g(\theta), \delta(X))] = \int L(g(\theta), \delta(x)) f_\theta(x)\, d\mu(x)$.

By basing the decision on the observations, the original
problem of choosing a decision d with the loss function
$L(g(\theta), d)$ is thus replaced by choosing $\delta$ with the average
loss $R(g(\theta), \delta)$.

Ideally, one would like to select $\delta$ which minimizes the
risk function (0.2) for all $\theta \in \Theta$. Unfortunately, in general
such "best" decision rules do not exist. So sometimes one is
led to considering restricted classes of decision procedures
which possess a certain degree of impartiality such as
unbiasedness, invariance, minimax, Bayes, etc., in the hope that within
such a restricted class there may exist a procedure which is
uniformly best. However, there are situations in which there
exists a decision procedure $\delta_0$ with uniformly minimum risk among
all invariant or unbiased procedures, but where there exists a
procedure $\delta$, not possessing this impartiality property and
preferable to $\delta_0$ (see, for example, Lehmann (1959), pages
24 and 26, problems 14 and 16). Thus the approaches based on
the principles of unbiasedness or invariance could be unreliable,
and for different reasons, the approaches based on the minimax
or Bayes principle could also be far from satisfactory. Thus,
as a first step, one considers the possibility of not insisting
on a unique solution but asking only how far a decision problem
can be reduced without loss of relevant information. This leads
to the concept of admissibility.

Definition 1. A decision procedure $\delta_0$ is said to be inadmissible
if there exists another procedure $\delta_1$ which dominates it in the
sense that

(0.3)    $R(g(\theta), \delta_1) \le R(g(\theta), \delta_0)$ for all $\theta \in \Theta$,
         $R(g(\theta), \delta_1) < R(g(\theta), \delta_0)$ for at least one $\theta \in \Theta$.

$\delta_0$ is called admissible if no such dominating $\delta_1$ exists.

Thus a decision procedure $\delta_0$ can be eliminated from
consideration if there exists a procedure $\delta_1$ which dominates it.

A class $\Gamma$ of decision procedures is said to be complete
if for any $\delta_0$ not in $\Gamma$, there exists a $\delta_1$ in $\Gamma$ which
dominates it.

The importance of admissible procedures lies in the fact
that under suitable assumptions on the loss function and the
density function, the admissible procedures form a complete class.
In fact, if a minimal complete class exists, it consists exactly
of the totality of the admissible procedures (and consequently
there is no need to look outside this class to find an estimation
procedure, for one can do just as well inside the class).
However, the general question of resolving the admissibility of all
estimates measured with respect to a suitable loss function (say,
a quadratic one, frequently used in practice) is intrinsically
difficult. One therefore concentrates on the investigation of
whether some of the commonly employed estimates are admissible.

One of the earliest papers in this direction is due to Hodges
and Lehmann (1951), who used the Cramér-Rao inequality for the
variance of an estimator of the parameter $\theta$ to obtain a
criterion which implies the admissibility of point estimators when the
loss is proportional to the square of the error of the estimate.
Their method, which involves the solution of a differential
inequality, is applied to certain problems involving the binomial,
Poisson, normal, and chi square distributions, and the unique
admissible minimax estimator is obtained in each case. Simultaneously
Girshick and Savage (1951), while investigating related problems in
relatively greater generality, proved (among other results) that if
the distribution of X belongs to a one parameter exponential family
where $f_\theta(x) = \beta(\theta)e^{\theta x}$, and if the loss function is the same as in
Hodges and Lehmann (1951), then X is an admissible (minimax) estimator
of $E_\theta X$ provided $-\infty < \theta < \infty$. Later Karlin (1958) proved an
interesting and surprising result: for the exponential family
given above, aX for any a satisfying $0 < a \le 1$ is an admissible
estimator of $E_\theta X$ whenever $\mu$ possesses positive measure
in the regions $x > 0$ and $x \le 0$ and $\Theta = (-\infty, \infty)$. On the
other hand, for any $a > 1$, aX is inadmissible. In view
of the fact that any contraction of X (aX, $0 < a \le 1$) is
admissible, it seems surprising that in practice one always
uses the extreme estimate of this kind. The criterion of
unbiasedness traditionally has dominated the choice of an estimator.
If the parameter space $\Theta$ of $\theta$ is not the full infinite
interval, then the problem of admissibility of aX becomes quite
complicated. It becomes more so if $\theta$ ranges over a finite
interval. In this case the analysis seems to depend on the rate
at which $\beta(\theta)$ tends to 0 as $\theta$ approaches its boundary. (For
details see Karlin (1958)).
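The contrast between $a \le 1$ and $a > 1$ can be illustrated in the simplest member of the family. The sketch below assumes X is normal with mean $\theta$ and unit variance (an illustrative choice, not an example taken from the text), where the squared-error risk of aX is $a^2 + (a - 1)^2\theta^2$. The helper names `risk_scaled` and `dominates` are hypothetical, and the check runs over a finite grid only, so it illustrates Definition 1 rather than proving anything.

```python
def risk_scaled(a, theta):
    """Squared-error risk of aX for estimating theta when X ~ N(theta, 1).

    The risk is the variance a^2 plus the squared bias (a - 1)^2 * theta^2.
    """
    return a * a + (a - 1.0) ** 2 * theta ** 2


def dominates(a1, a0, thetas):
    """True if a1*X has risk <= that of a0*X on the grid, strictly somewhere."""
    pairs = [(risk_scaled(a1, t), risk_scaled(a0, t)) for t in thetas]
    return all(r1 <= r0 for r1, r0 in pairs) and any(r1 < r0 for r1, r0 in pairs)


# Grid of parameter values theta in [-10, 10] (an arbitrary illustrative range).
grid = [i / 10.0 for i in range(-100, 101)]

x_beats_expanded = dominates(1.0, 1.2, grid)  # X against the expansion 1.2 X
half_beats_x = dominates(0.5, 1.0, grid)      # contraction 0.5 X against X
x_beats_half = dominates(1.0, 0.5, grid)      # X against the contraction 0.5 X
```

On this grid X dominates 1.2X, while neither X nor 0.5X dominates the other: the contraction is better near $\theta = 0$ and worse for large $|\theta|$, which is compatible with both being admissible.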

Later Ping (1964) and Gupta (1966) gave sufficient conditions
for the admissibility of the estimators of the form aX + b for
the problem considered in Karlin (1958) cited above. More recently,
Ghosh and Meeden (1977) considered the problem of estimating a
piecewise continuous function $\gamma(\theta)$, not necessarily the mean, by an
estimator of the form aX + b, and provided sufficient conditions
for the admissibility of such estimators. All these papers deal
with linear or affine estimators. However, there are important
problems where the class of estimators studied includes nonlinear
estimators. In particular, this is the case in estimating a
function of the scale parameter in a Gamma density, or a function of the
variance in a normal density. The admissibility of particular
nonlinear estimators of the form c/X, where c is a constant,
has been studied by Ghosh and Singh (1970), in estimating the
parameter of an exponential density. However, as of now, there
are no general results available dealing with nonlinear
estimators which would apply to a broader class of densities.
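To see why constants other than the unbiased one are of interest for estimators of the form c/X, the following sketch assumes that X is the sum of n independent exponential observations with rate $\lambda$, so that X has a Gamma distribution with $E[1/X] = \lambda/(n-1)$ and $E[1/X^2] = \lambda^2/((n-1)(n-2))$ for $n > 2$. Under squared error the risk of c/X is then $\lambda^2$ times a quadratic in c. The function name `risk_coefficient` and the choice n = 5 are illustrative assumptions.

```python
from fractions import Fraction


def risk_coefficient(c, n):
    """Return r with E[(c/X - lam)^2] = r * lam^2 for X ~ Gamma(n, rate lam), n > 2.

    Uses E[1/X] = lam/(n-1) and E[1/X^2] = lam^2/((n-1)(n-2)).
    """
    c = Fraction(c)
    return c * c / ((n - 1) * (n - 2)) - 2 * c / (n - 1) + 1


n = 5
# Candidate constants c = 0.0, 0.1, ..., 8.0, kept exact with Fraction.
candidates = [Fraction(k, 10) for k in range(0, 81)]
best_c = min(candidates, key=lambda c: risk_coefficient(c, n))

risk_best = risk_coefficient(n - 2, n)      # c = n - 2
risk_unbiased = risk_coefficient(n - 1, n)  # c = n - 1 makes c/X unbiased
```

Within this one-parameter class the minimizing constant is c = n - 2, with risk $\lambda^2/(n-1)$, against $\lambda^2/(n-2)$ for the unbiased choice c = n - 1. This comparison is only within the class c/X and does not by itself settle admissibility among all estimators.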

In Chapter 1, Section 1.2 we give sufficient conditions for
the admissibility of nonlinear estimators of the form
$\delta(X) = (aX + b)/(cX + d)$, in estimating an arbitrary function
$g(\theta)$, with squared error loss. The results obtained include
the results of Karlin (1958), Ghosh and Singh (1970), and Ghosh
and Meeden (1977), among others.

As a corollary, we give sufficient conditions for the
admissibility of estimators of the form $\delta(X) = c/X$.

In Section 1.3 we give several examples of nonlinear
admissible estimators, especially in estimating a function of the scale
parameter in a Gamma or normal density. Some new admissible
estimators are discovered. A surprising example shows the "almost
inadmissibility" (in a sense to be made precise later) of the
commonly used estimator of the variance in a normal $(0, \sigma^2)$
density.

In Section 1.4 we extend the results of Katz (1961) and of
Ghosh and Meeden (1977) concerning admissibility when the
parameter space is truncated. We derive admissible estimators of
the form $(aX + b)/(cX + d) + \phi(X)$, where $\phi(X)$ is a
"correction" due to the truncation.

The problem of admissibility is related to the problem of
finding minimax estimators. To define the latter concept, let
$R(g(\theta), \delta)$ denote, as before, the risk associated with the
estimator $\delta(X)$.


An estimator $\delta^0$ is called minimax if

(0.4)    $\sup_{\theta \in \Theta} R(g(\theta), \delta^0) = \inf_{\delta} \sup_{\theta \in \Theta} R(g(\theta), \delta)$,

where the infimum is taken over all estimators $\delta(X)$ of $g(\theta)$.
Intuitively, the minimax approach is to choose an estimator which
protects against the largest possible risk, when $\theta$ varies over
$\Theta$. There is a considerable amount of published results on the
existence of minimax estimators; see especially Chapter 2 of
Wald (1950).
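The left-hand side of (0.4) can be made concrete with the classical binomial example associated with Hodges and Lehmann (1951). The sketch below assumes X is binomial(n, p), the target is p, and the loss is squared error; it compares the maximum risk of X/n with that of the affine estimator $(X + \sqrt{n}/2)/(n + \sqrt{n})$, whose risk is constant in p. The function names, the value n = 16 and the grid over p are illustrative choices, and the supremum is approximated on a grid. The code does not compute the infimum in (0.4), so it illustrates the definition without establishing minimaxity.

```python
import math


def risk_affine_binomial(a, b, n, p):
    """Squared-error risk of aX + b for estimating p when X ~ Binomial(n, p)."""
    variance = a * a * n * p * (1.0 - p)
    bias = a * n * p + b - p
    return variance + bias * bias


def max_risk(a, b, n, grid):
    """Largest risk of aX + b over a finite grid of p values."""
    return max(risk_affine_binomial(a, b, n, p) for p in grid)


n = 16
grid = [i / 1000.0 for i in range(1001)]  # p = 0, 0.001, ..., 1
root = math.sqrt(n)

sup_mle = max_risk(1.0 / n, 0.0, n, grid)                              # X / n
sup_hl = max_risk(1.0 / (n + root), 0.5 * root / (n + root), n, grid)  # constant risk
```

For n = 16 the maximum risk of X/n is 1/(4n) = 1/64, attained at p = 1/2, while the second estimator has risk $1/(4(1 + \sqrt{n})^2) = 1/100$ for every p.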

In the case of the one parameter exponential family, Ping
(1964) gave sufficient conditions for the minimaxity of affine
estimators of the form $\delta(X) = aX + b$, in estimating the mean
$g(\theta) = E_\theta X$, under the normalized squared error loss

(0.5)    $L(g(\theta), \delta(x)) = \frac{(g(\theta) - \delta(x))^2}{\mathrm{var}_\theta X}$,

where $\mathrm{var}_\theta X$ is the variance of X.

In Chapter 2, Section 2.1, we generalize this result and we
give sufficient conditions for the minimaxity of aX + b in
estimating an arbitrary (differentiable) function $g(\theta)$ when
the loss function is (0.5).

In Section 2.2 we give some new examples of minimax
estimators, in estimating a function $g(\theta)$ different from the mean,
$E_\theta X$. Affine minimax estimators arise especially
in estimating a function of the scale parameter in a Gamma
density.

In Chapters 1 and 2, the observations are considered to
be real-valued random variables, and the parameter to be estimated is
one-dimensional.

In multivariate estimation problems, the situation changes
considerably. The standard example is when the observation
$X = (X_1, X_2, \ldots, X_p)$ has a p-variate normal distribution with
mean $\theta$ and covariance matrix I (the $p \times p$ identity
matrix).

It was a surprising result when Stein (1955) and, later,
James and Stein (1960) showed that, in dimensions $p \ge 3$, the
best invariant estimator $\delta_0(X) = X$ of a multivariate normal
mean is inadmissible. They found an estimator which strictly
dominates $\delta_0$. More precisely, they proved that $\delta_0$ is
inadmissible if and only if $p \ge 3$.
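A small simulation can make the Stein phenomenon tangible. The sketch below assumes the textbook form of the James-Stein estimator, $(1 - (p-2)/|X|^2)X$, which is not written out in the text, together with sum-of-squares loss and X normal with identity covariance. The function names, the dimension p = 5, the choice $\theta = 0$, the seed and the number of replications are all illustrative assumptions.

```python
import random


def james_stein(x):
    """Textbook James-Stein estimator (1 - (p - 2)/|x|^2) x, for p >= 3."""
    p = len(x)
    shrink = 1.0 - (p - 2) / sum(v * v for v in x)
    return [shrink * v for v in x]


def mc_risk(estimator, theta, reps, rng):
    """Monte Carlo estimate of E|estimator(X) - theta|^2 for X ~ N_p(theta, I)."""
    total = 0.0
    for _ in range(reps):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        est = estimator(x)
        total += sum((e - t) ** 2 for e, t in zip(est, theta))
    return total / reps


rng = random.Random(0)  # fixed seed so the run is reproducible
theta = [0.0] * 5       # p = 5, true mean at the origin

risk_x = mc_risk(lambda x: x, theta, 50_000, rng)   # best invariant estimator X
risk_js = mc_risk(james_stein, theta, 50_000, rng)  # James-Stein estimator
```

At $\theta = 0$ the exact risks are p = 5 for X and $p - (p-2)^2 E[1/\chi^2_p] = 2$ for the James-Stein estimator, and the simulated values should fall close to these. The gain shrinks as $|\theta|$ grows, but the James-Stein risk never exceeds that of X.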

Since then, much work has been done in the direction of
proving inadmissibility of the best invariant estimator, for
estimation problems in a relatively general framework. Based
on results of Farrell (1964), many contributions to this
subject were made by Brown ((1966), (1975)). In Brown (1978) a
heuristic approach is given to prove admissibility and
inadmissibility of estimators in a wide variety of multivariate problems.

The work of Stein and Brown suggests a new problem, namely
that of finding estimators which are better than the best
invariant estimator when sampling from a location parameter family.

In James and Stein (1960) estimators for the mean of a
multivariate normal distribution are given, which are better than
the best invariant estimator X, where X is an observation
of the distribution.

Baranchik (1964) found a larger class of estimators,
better than X, which includes the James-Stein estimators.

Until 1974, estimators which were better than the best
invariant estimator were only available for the mean vector of
a multivariate normal distribution. Then, Strawderman (1974)
and Berger (1975) found minimax estimators which are better
than the best invariant estimators, when sampling from certain
spherically symmetric unimodal distributions. Later, more
results along these lines were obtained by Brandwein and
Strawderman (1978), Brandwein (1979), and Brandwein and
Strawderman (1980). Mainly, these results concentrated on
describing classes of minimax estimators for the mean of a spherically
symmetric distribution, for various classes of loss functions,
more general than the quadratic loss.

However, estimators which improve upon the best invariant
estimator have been found only in the special cases of normal
and spherically symmetric distributions. In Chapter 3 we
investigate several types of estimators which improve upon the best
invariant estimator when the underlying distributions are not
necessarily normal or spherically symmetric and also when the
loss function is relatively more general than the quadratic one
(frequently considered in the literature). We prove that (under
suitable assumptions) the convolution of an estimator $\delta_1$, which
improves upon $\delta_0$ outside of a compact set, with a truncated
probability density in $\mathbb{R}^p$ gives an estimator $\delta$ which
improves uniformly upon the best invariant estimator $\delta_0$.

Different examples are given and, according to the
convolving density (such as spherically uniform, truncated
densities, and others), different classes of estimators better than
$\delta_0$ are described.

In Section 3.3, we present some other new classes of
estimators which improve upon the best invariant estimator $\delta_0$ in
higher dimensions. The critical dimension for which we have
improvement depends on the estimator we start with, which
improves upon $\delta_0$ outside of a compact set.

Finally, we also state some problems which were not solved
in the present context, but which could be of further research
interest.


CHAPTER  I 


A  CLASS  OF  NONLINEAR  ADMISSIBLE  ESTIMATORS 


In this chapter we investigate the admissibility of nonlinear
estimators of the form (aX + b)/(cX + d) in the one parameter
exponential family $f_\theta(x) = \beta(\theta)e^{\theta x}$, in estimating an arbitrary
function $g(\theta)$ with quadratic loss. Particular cases of the
estimators of the form c/X are also studied and several examples
of nonlinear admissible estimators of the form (aX + b)/(cX + d)
and c/X are given. We also consider the problem of admissibility
when the parameter space is truncated and derive admissible
estimators of the form $(aX + b)/(cX + d) + \phi(X)$, where $\phi(X)$
is a "correction" due to truncation.

Since our problem originates from Karlin (1958), who studied
the admissibility of linear estimators of the form aX in
estimating the mean $E_\theta X$ of the one parameter exponential family, and
since our results also include those of Ghosh and Meeden (1977),
we state the main theorems of Karlin (1958) and of Ghosh and Meeden
(1977) in Section 1.1 for appropriate background. In Section 1.2
we present the main theorem dealing with the admissibility of
nonlinear estimators of the form (aX + b)/(cX + d) in estimating
an arbitrary function $g(\theta)$ with quadratic loss. As a corollary
we give sufficient conditions for the admissibility of the
estimators of the form c/X. In Section 1.3 we give several examples
of nonlinear admissible estimators of the form (aX + b)/(cX + d)
and c/X. These examples come especially from estimating a
function $g(\lambda)$ in an exponential distribution $\lambda e^{-\lambda x} 1_{[0,\infty)}(x)$ and
$g(\sigma^2)$ in a normal density $N(0, \sigma^2)$. We also give an example
showing the "almost inadmissibility" of the estimator commonly
used in estimating the variance in the normal density $N(0, \sigma^2)$.
In Section 1.4 we derive admissible estimators of the form
$(aX + b)/(cX + d) + \phi(X)$ (where $\phi(X)$ is a "correction") in the
case when the parameter space is truncated.


1.1.  Admissibility  of  linear  estimators 


In this section we give, for convenience and appropriate
background, the main result of Karlin (1958), from which our
problem originated. Let the random variable X be distributed
according to the probability distribution

(1.1)    $dF_\theta(x) = \beta(\theta)e^{\theta x}\, d\mu(x)$,

where $\mu$ is a $\sigma$-finite measure defined on the real line, $\theta$ is
an unknown parameter, and we assume that $\theta \in \Theta$, where

(1.2)    $\Theta = \{\theta : \int e^{\theta x}\, d\mu(x) < \infty\}$.

Since $\Theta$ is a convex subset of $\mathbb{R}$, it is an interval of the
real line. Let $\bar\theta$ and $\underline\theta$ be the upper and lower end points of
$\Theta$, respectively. Karlin (1958) considered the problem of
estimating $g(\theta) = E_\theta X$ from a single observation X and derived

[...]

$a \ne 0$ in estimating an arbitrary piecewise continuous, locally
integrable function $g(\theta)$ with the quadratic loss (1.3).
Specifically, Ghosh and Meeden proved the following theorem.

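Theorem 1.2. (Ghosh and Meeden). Let $f_\theta(x)$ be as given in
Theorem 1.1. Let

(1.6)    $\phi(\theta) = \beta(\theta) \exp\left\{ \frac{1}{a} \int_\alpha^\theta g(t)\, dt - b\theta/a \right\}$,

where $\alpha$ is an interior point of $\Theta$. If

(1.7)    $\int_c^b \phi(\theta)\, d\theta \to +\infty$ as $b \to \bar\theta$,

and

(1.8)    $\int_a^c \phi(\theta)\, d\theta \to +\infty$ as $a \to \underline\theta$,

where c is an interior point of $\Theta = (\underline\theta, \bar\theta)$ (the limits of
integration a and b in (1.7) and (1.8) are dummy variables, distinct
from the coefficients of the estimator), then $\delta_0(X) = aX + b$
is an admissible estimator of $g(\theta)$ (where $g(\theta)$ is an arbitrary
piecewise continuous, locally integrable function) when the loss
function is given by (1.3).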

These sufficient conditions describe the tail behavior of some
improper prior distribution.

Although not studied in the present context, an important
problem is related to the converse of Theorem 1.1. More precisely, are
these conditions for admissibility also necessary?

In Karlin (1958) it was shown that if one of the integrals
in Theorem 1.1 is convergent, then the corresponding estimator
$\delta_0$ is inadmissible, outside of a closed interval. Although
some progress was made later, the complete answer is still
unknown.


1.2.  Admissibility  of  (aX  +  b)/(cX  +  d) 


In this section we consider the problem of estimating a
function $g(\theta)$ which is piecewise continuous; further
restrictions will be imposed later on g. To this end we first write
(aX + b)/(cX + d) as a formal Bayes estimator, with respect to
some (generally improper) prior distribution. For more details
on this kind of approach, see Zidek (1970).

If $\pi(\theta)$ is the Radon-Nikodym derivative of the prior
distribution with respect to the Lebesgue measure, we can write

(1.9)    $\frac{ax + b}{cx + d} = \frac{\int g(\theta)\beta(\theta)e^{\theta x}\pi(\theta)\, d\theta}{\int e^{\theta x}\beta(\theta)\pi(\theta)\, d\theta}$.

Integrating by parts, we get

(1.10)   $-a\int e^{\theta x}(\beta\pi)'\, d\theta + b\int e^{\theta x}\beta\pi\, d\theta = -c\int (g\beta\pi)' e^{\theta x}\, d\theta + d\int g\beta\pi\, e^{\theta x}\, d\theta$

and, by the uniqueness of the Laplace transform:

(1.11)   $-a(\beta\pi)' + b(\beta\pi) = -c(g\beta\pi)' + d(g\beta\pi)$.

To solve the above differential equation, we can write, after
some simple calculations:

(1.12)   $(\log \beta\pi)' = \frac{dg - b}{cg - a} - (\log |cg - a|)'$.

For simplicity, in the above formulas, we have suppressed the
argument in denoting functions.

The differential equation (1.12) has the solution:

(1.13)   $\pi(\theta) = \frac{1}{\beta(\theta)\,|cg(\theta) - a|} \exp\left\{ \int_\alpha^\theta \frac{dg(t) - b}{cg(t) - a}\, dt \right\}$,

where $\alpha$ is an interior point of $\Theta$.

We mention that the calculation above should be viewed merely
as heuristic, with the goal of deriving the expression (1.13) for
the prior $\pi$.
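A numerical check of this heuristic is possible in a simple case. Consider one observation from the exponential density $\lambda e^{-\lambda x}$, written as $\beta(\theta)e^{\theta x}$ with $\theta = -\lambda$ and $\beta(\theta) = -\theta$, and take $g(\theta) = -\theta = \lambda$, a = 0, c = 1. Assuming the prior in (1.13) has the form $\pi(\theta) = [\beta(\theta)|cg(\theta) - a|]^{-1} \exp\{\int_\alpha^\theta (dg(t) - b)/(cg(t) - a)\, dt\}$, a short calculation gives $\pi$ proportional to $\lambda^{b-2}e^{-d\lambda}$ on the $\lambda$ scale. The sketch below approximates the ratio of integrals in (1.9) by a midpoint rule and compares it with b/(x + d). The function names, the integration cutoff and the parameter values are illustrative assumptions.

```python
import math


def prior(lam, b, d):
    """Prior on lam = -theta implied by the assumed form of (1.13) in this example.

    Proportional to lam^(b - 2) * exp(-d * lam); constants cancel in the ratio.
    """
    return lam ** (b - 2) * math.exp(-d * lam)


def formal_bayes(x, b, d, upper=50.0, steps=100_000):
    """Midpoint-rule approximation of the ratio of integrals in (1.9)."""
    h = upper / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        # Likelihood lam * exp(-lam * x) of one observation times the prior.
        w = lam * math.exp(-lam * x) * prior(lam, b, d)
        num += lam * w  # g(theta) = lam in the numerator
        den += w
    return num / den


check_1 = formal_bayes(2.0, 3, 1.0)  # compare with 3 / (2 + 1)
check_2 = formal_bayes(1.5, 4, 0.5)  # compare with 4 / (1.5 + 0.5)
```

In both cases the numerical ratio should agree with b/(x + d) up to quadrature error, as expected if the estimator b/(X + d) is formal Bayes with respect to this prior.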

Throughout the remainder of this chapter, we make the
following assumptions:

(A1)    $cg(\theta) - a > 0$, for all $\theta \in \Theta$;

(A2)    $\int_u^v \frac{dg(t) - b}{cg(t) - a}\, dt$ exists for any $[u, v] \subset \Theta$;

(A3)    $\int \frac{e^{\theta x}}{(cx + d)^2}\, d\mu(x) < \infty$, for all $\theta \in \Theta$.

The main result giving sufficient conditions for admissibility
is contained in the following.


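Theorem 1.3. Let the density of X be $f_\theta(x) = \beta(\theta)e^{\theta x}$ and let
$\underline\theta$, $\bar\theta$ be the endpoints of $\Theta$. Suppose that conditions (A1)
through (A3) are satisfied. Denote by

(1.14)   $\sigma(\theta) = \pi(\theta)\beta(\theta)(cg(\theta) - a)^2 \int \frac{e^{\theta x}}{(cx + d)^2}\, d\mu(x)$,

where $\pi(\theta)$ is given by (1.13).
If

(1.15)   $\lim_{v \to \bar\theta} \int_u^v \sigma^{-1}(\theta)\, d\theta = \infty$,    $\lim_{u \to \underline\theta} \int_u^v \sigma^{-1}(\theta)\, d\theta = \infty$,

then $\delta_0(X) = (aX + b)/(cX + d)$ is admissible in estimating
$g(\theta)$ with quadratic loss.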


Proof: Suppose that (aX + b)/(cX + d) is not admissible;
then there exists an estimator $\delta$ such that

(1.16)  $\int_{-\infty}^{\infty} (\delta(x) - g(\theta))^2 f_\theta(x)\,d\mu(x) \le \int_{-\infty}^{\infty} \left(\frac{ax + b}{cx + d} - g(\theta)\right)^2 f_\theta(x)\,d\mu(x)$

for all $\theta \in \Theta$ and with strict inequality for at least one
$\theta \in \Theta$ .

We will show that $\delta(x) = \frac{ax + b}{cx + d}$ a.e. (with respect to $\mu$ ).

First, a simple calculation shows that (1.16) is equivalent
to:

(1.17)  $\int_{-\infty}^{\infty} \left(\delta(x) - \frac{ax + b}{cx + d}\right)^2 f_\theta(x)\,d\mu(x) \le 2\int_{-\infty}^{\infty} \left(\frac{ax + b}{cx + d} - \delta(x)\right)\left(\frac{ax + b}{cx + d} - g(\theta)\right) f_\theta(x)\,d\mu(x)$ .


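Multiplying both sides by $\pi$ (given by (1.13)), integrating
over $[u,v] \subset \Theta$ , and using Fubini's theorem, we get:

(1.18)  $\int_u^v \left[\int_{-\infty}^{\infty} \left(\delta(x) - \frac{ax + b}{cx + d}\right)^2 \beta(\theta)\,e^{\theta x}\,d\mu(x)\right] \pi(\theta)\,d\theta$
        $\le 2\int_{-\infty}^{\infty} \left(\frac{ax + b}{cx + d} - \delta(x)\right) \left\{\int_u^v \left(\frac{ax + b}{cx + d} - g(\theta)\right) \beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta\right\} d\mu(x)$ .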


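By using (1.13) and assumption (A1), the inner integral in
the right hand side of (1.18) can be simplified as follows:

(1.19)  $\int_u^v \left(\frac{ax + b}{cx + d} - g(\theta)\right) \beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta$

        $= \int_u^v \left(\frac{ax + b}{cx + d} - g(\theta)\right) (c\,g(\theta) - a)^{-1} \exp\left\{\theta x + \int_\alpha^\theta \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\} d\theta$

        $= -\frac{1}{cx + d} \int_u^v \frac{(cx + d)\,g(\theta) - (ax + b)}{c\,g(\theta) - a} \exp\left\{\theta x + \int_\alpha^\theta \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\} d\theta$

        $= -\frac{1}{cx + d} \int_u^v \frac{\partial}{\partial\theta} \exp\left\{\theta x + \int_\alpha^\theta \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\} d\theta$

        $= \frac{1}{cx + d} \left[\exp\left\{ux + \int_\alpha^u \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\} - \exp\left\{vx + \int_\alpha^v \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\}\right]$ .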


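Denote by $T(\theta) = \int_{-\infty}^{\infty} \left(\delta(x) - \frac{ax + b}{cx + d}\right)^2 \beta(\theta)\,e^{\theta x}\,d\mu(x)$ ;
it is enough to show that $T(\theta_0) = 0$ , for some $\theta_0$ .

By using (1.18), (1.19), and the Schwarz inequality, we
get

(1.20)  $\int_u^v T(\theta)\,\pi(\theta)\,d\theta$

        $\le 2\int_{-\infty}^{\infty} \left(\frac{ax + b}{cx + d} - \delta(x)\right) \frac{1}{cx + d} \left[\exp\left\{ux + \int_\alpha^u \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\} - \exp\left\{vx + \int_\alpha^v \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right\}\right] d\mu(x)$

        $\le 2\,T^{1/2}(u)\,\beta^{-1/2}(u) \left(\int \frac{e^{ux}}{(cx + d)^2}\,d\mu(x)\right)^{1/2} \exp\left(\int_\alpha^u \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right)$

        $\quad + 2\,T^{1/2}(v)\,\beta^{-1/2}(v) \left(\int \frac{e^{vx}}{(cx + d)^2}\,d\mu(x)\right)^{1/2} \exp\left(\int_\alpha^v \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right)$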

(cont. ) 




which can be written as:

(1.23)  $\frac{M'(v)}{M^2(v)} \ge \frac{1}{K\,\pi(v)\,\beta(v)\,(c\,g(v) - a)^2 \int \frac{e^{vx}}{(cx + d)^2}\,d\mu(x)}$ .

Choose $v_1, v_2 \in \Theta$ , $v_1 < v_2$ , and assume $M(v_1) > 0$ . Then

(1.24)  $\frac{1}{M(v_1)} - \frac{1}{M(v_2)} = \int_{v_1}^{v_2} \frac{M'(v)}{M^2(v)}\,dv \ge \int_{v_1}^{v_2} \frac{dv}{K\,\pi(v)\,\beta(v)\,(c\,g(v) - a)^2 \left(\int \frac{e^{vx}}{(cx + d)^2}\,d\mu(x)\right)}$ .

Since the left-hand side is bounded by $[M(v_1)]^{-1}$ , and the
right-hand side equals $K^{-1}\int_{v_1}^{v_2} \sigma^{-1}(v)\,dv$ (where $\sigma$ is given
by (1.14)), we get a contradiction, by letting $v_2 \to \bar{\theta}$ and
using the first part of the hypothesis (1.15).

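Case 2:  $\liminf_{v \to \bar{\theta}}\ \pi(v)\,T^{1/2}(v)\,\beta^{1/2}(v)\,(c\,g(v) - a) \left(\int \frac{e^{vx}}{(cx + d)^2}\,d\mu(x)\right)^{1/2} = 0$ .

Then, by using Fatou's lemma, we get

(1.25)  $\int_u^{\bar{\theta}} T(\theta)\,\pi(\theta)\,d\theta \le 2\,\pi(u)\,T^{1/2}(u)\,\beta^{1/2}(u)\,(c\,g(u) - a) \left(\int \frac{e^{ux}}{(cx + d)^2}\,d\mu(x)\right)^{1/2}$ .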


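If we denote by $N(u) = \int_u^{\bar{\theta}} T(\theta)\,\pi(\theta)\,d\theta$ , we can write

(1.26)  $N^2(u) \le 4\,(-N'(u))\,\pi(u)\,\beta(u)\,(c\,g(u) - a)^2 \int \frac{e^{ux}}{(cx + d)^2}\,d\mu(x)$ .

Thus:

(1.27)  $\frac{-N'(u)}{N^2(u)} \ge \frac{1}{4\,\pi(u)\,\beta(u)\,(c\,g(u) - a)^2 \left(\int \frac{e^{ux}}{(cx + d)^2}\,d\mu(x)\right)}$ .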


If $N(u_0) = 0$ for some $u_0$ , then $T(\theta)\pi(\theta) = 0$ a.e. on
$[u_0, \bar{\theta}]$ ; therefore $T(\theta_0) = 0$ for some $\theta_0$ , and we are done.

If we assume $N(u) \ne 0$ for any u , then, by using the same
argument as in Case 1, and the second half of the hypothesis
(1.15), we are led to a contradiction. This ends the proof.

Remark. The assumption (A1) can be replaced throughout by
$c\,g(\theta) - a < 0$ , for all $\theta \in \Theta$ .

Observe that Theorem 1.3 includes Theorem 1.1 of Karlin
(1958), if we take b = c = 0 , d = 1 , and
$g(\theta) = E_\theta X = -\beta'(\theta)/\beta(\theta)$ . To obtain Theorem 1.2 of Ghosh and
Meeden (1977), take c = 0 , d = 1 .

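As a particular case of our theorem, we give sufficient
conditions for the admissibility of nonlinear estimators of the
form c/X :

Corollary 1.1. Suppose that $g(\theta) > 0$ for all $\theta \in \Theta$ ,
$\int_u^v \frac{dt}{g(t)}$ exists for any $[u,v] \subset \Theta$ , and
$\int_{-\infty}^{\infty} \frac{e^{\theta x}}{x^2}\,d\mu(x) < \infty$ .
If

(1.28)  $\lim_{v \to \bar{\theta}} \int_u^v \frac{1}{g(\theta)} \left[\int_{-\infty}^{\infty} \frac{e^{\theta x}}{x^2}\,d\mu(x)\right]^{-1} \exp\left(c \int_\alpha^\theta \frac{dt}{g(t)}\right) d\theta = \infty$

and

(1.29)  $\lim_{u \to \underline{\theta}} \int_u^v \frac{1}{g(\theta)} \left[\int_{-\infty}^{\infty} \frac{e^{\theta x}}{x^2}\,d\mu(x)\right]^{-1} \exp\left(c \int_\alpha^\theta \frac{dt}{g(t)}\right) d\theta = \infty$ ,

then $\delta_0(X) = c/X$ is admissible in estimating $g(\theta)$ with quadratic
loss.

Remark. The hypotheses in the Corollary above can also be expressed
in the following, equivalent form: $g(\theta)$ is positive, $1/g(\theta)$ is
locally integrable in $\Theta$ , and $E_\theta(X^{-2}) < \infty$ .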


1.3.  Examples  of  nonlinear  admissible  estimators 

The examples to be presented here are related to the estimation
of a function of the scale parameter in a Gamma density. Examples
1 and 2 are concerned with estimating a function of the parameter
in an exponential density. A new estimator is presented in
Example 2. Example 3 is concerned with the estimation of the
reciprocal of the variance in an $N(0,\sigma^2)$ density. Example
4 is related to the estimation of the variance in an $N(0,\sigma^2)$
density. This example shows that the admissible estimator
commonly used to estimate $\sigma^2$ is "almost inadmissible" (in a sense
to be made more precise below). In Example 6 we find admissible
estimators of the most general form (aX + b)/(cX + d) , again
in the case of an exponential density.


Example 1. Suppose that $X_1, X_2, \dots, X_n$ are independent and
identically distributed random variables with exponential density
$\lambda e^{-\lambda x} I_{(0,\infty)}(x)$ , where $\lambda > 0$ . We want to estimate $g(\lambda) = \lambda$ .

Since $X = \sum_{i=1}^{n} X_i$ is a sufficient statistic for $\lambda$ , we can
consider only estimators based on X . The density of X is Gamma
of the form

(1.30)  $f_\lambda(x) = \frac{\lambda^n}{\Gamma(n)}\,x^{n-1}\,e^{-\lambda x}\,I_{(0,\infty)}(x)$ .

By changing the parameter into $\theta = -\lambda$ , we get:

(1.31)  $f_\theta(x) = \frac{(-\theta)^n}{\Gamma(n)}\,x^{n-1}\,e^{\theta x}\,I_{(0,\infty)}(x)$

and we estimate $g(\theta) = -\theta$ .

It is easy to see that conditions (1.28) and (1.29) of
Corollary 1.1 are satisfied for c = n - 2 . Thus, if $n \ge 3$ ,
the estimator (n-2)/X is admissible in estimating $\lambda$ . This is
a well-known result (see Ghosh and Singh (1970)).


Example 2. Consider again $X_1, \dots, X_n$ iid with density
$\lambda e^{-\lambda x} I_{(0,\infty)}(x)$ , $\lambda > 0$ . We want to estimate $g(\lambda) = \lambda$ .

It is easy to see that if $X = \sum_{i=1}^{n} X_i$ , the estimator
(n-2)/(X+k) is admissible in estimating $\lambda$ , for any $k \ge 0$ .

This result does not seem to be known. Of course, Example
1 is a particular case, for k = 0 .

Also note that the estimators (n-2)/(X+k) , k > 0 , and
(n-2)/X are not equivalent (i.e., the risk of (n-2)/(X+k)
depends on k ), and, therefore, at some points $\lambda > 0$ , it
is possible to improve upon the risk of (n-2)/X .
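The estimators of Examples 1 and 2 can be checked numerically against the heuristic prior (1.13). The sketch below is an illustration only, under assumptions that are not spelled out in the text: specializing (1.13) to a = 0, b = n - 2, c = 1, d = k, with $g(\theta) = -\theta$ and $\beta(\theta)$ proportional to $(-\theta)^n$, gives a prior proportional to $(-\theta)^{-3} e^{k\theta}$, that is, $\lambda^{-3} e^{-k\lambda}$ in terms of $\lambda = -\theta$. The helper names `simpson` and `generalized_bayes` are hypothetical, and the integrals are truncated at a finite upper limit.

```python
import math


def simpson(f, lo, hi, panels=20000):
    """Composite Simpson rule on [lo, hi]; `panels` must be even."""
    h = (hi - lo) / panels
    total = f(lo) + f(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def generalized_bayes(x, n, k=0.0):
    """Posterior mean of lam = -theta given X = x, for X ~ Gamma(n, lam)
    and the improper prior lam**(-3) * exp(-k * lam) (assumed form of (1.13))."""
    rate = x + k
    upper = 80.0 / rate  # beyond this point the integrands are below exp(-80)
    num = simpson(lambda lam: lam ** (n - 2) * math.exp(-lam * rate), 0.0, upper)
    den = simpson(lambda lam: lam ** (n - 3) * math.exp(-lam * rate), 0.0, upper)
    return num / den


# Compare the numerical posterior mean with (n - 2) / (x + k).
for n, x, k in [(3, 2.0, 0.0), (5, 2.0, 0.0), (5, 2.0, 1.0)]:
    print(n, x, k, generalized_bayes(x, n, k), (n - 2) / (x + k))
```

In each case the two printed values should agree up to quadrature error, which is consistent with (n-2)/(X+k) being the generalized Bayes estimator for this prior; the check says nothing about admissibility itself, which rests on Theorem 1.3.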


Example 3. In this example we consider $X_1, X_2, \dots, X_n$ normally
distributed with mean 0 and variance $\sigma^2 > 0$ . The function
to be estimated is $1/\sigma^2$ .

Since the statistic $X = \sum_{i=1}^{n} X_i^2$ is sufficient for $\sigma^2$ ,
our admissible estimator will be a function of X .

It is well-known that $(\sum_{i=1}^{n} X_i^2)/\sigma^2$ is distributed as $\chi^2_n$ .
If we denote by $\theta = -\frac{1}{2\sigma^2}$ , then $\theta < 0$ and the density of X
is

(1.32)  $f_\theta(x) = \frac{(-\theta)^{n/2}}{\Gamma(n/2)}\,e^{\theta x}\,x^{n/2-1}\,I_{(0,\infty)}(x)$ .

Also $g(\theta) = -2\theta > 0$ . In looking for an admissible
estimator of the form c/X , it is easily seen that conditions
(1.28), (1.29) are satisfied for c = n - 4 .

Thus, if $n \ge 5$ , the estimator $(n-4)/(\sum_{i=1}^{n} X_i^2)$ is admissible
in estimating $1/\sigma^2$ in sampling from an $N(0,\sigma^2)$ density.


Example 4. Let us consider again $X_1, X_2, \dots, X_n$ iid with
density $N(0,\sigma^2)$ and we want to estimate $g(\sigma^2) = \sigma^2$ .

If $X = \sum_{i=1}^{n} X_i^2$ , it is well-known that X/(n+2) is admissible
in estimating $\sigma^2$ (this can be deduced easily, for example,
by using Karlin's Theorem 1.1).

By applying Theorem 1.3 (or, directly, Theorem 1.2), we see
that (X + k)/(n + 2) is admissible in estimating $\sigma^2$ , for
every $k \ge 0$ .

We have here a surprising property, showing that even if
X/(n+2) is admissible, we can strictly improve upon its risk,
on "almost" the whole parameter space. In this sense we say
that X/(n+2) is "almost inadmissible".

To make this discussion more precise, let us denote by

        $Y_k = \frac{X + k}{n + 2}$ ,  $k \ge 0$ .

The risk of $Y_k$ (with quadratic loss) can easily be computed:

(1.33)  $R(Y_k,\sigma^2) = \frac{k^2 - 4k\sigma^2 + 2(n + 2)\sigma^4}{(n + 2)^2}$ .

The risk of the classical estimator $Y_0 = \frac{X}{n + 2}$ is
$R(Y_0,\sigma^2) = \frac{2\sigma^4}{n + 2}$ .

Therefore $R(Y_k,\sigma^2) < R(Y_0,\sigma^2)$ if $\sigma^2 > k/4$ . Roughly
speaking, if k goes to 0 , then the set on which
$R(Y_k,\sigma^2) < R(Y_0,\sigma^2)$ will "approach" the whole parameter space
$(0,\infty)$ .

For large values of $\sigma^2$ , $Y_k$ improves substantially upon
$Y_0$ . Also, since

(1.34)  $R(Y_0,\sigma^2) - R(Y_k,\sigma^2) = -\frac{k(k - 4\sigma^2)}{(n + 2)^2} > -\frac{k^2}{(n + 2)^2}$ ,

it follows that, for $\sigma^2 < k/4$ , $Y_0$ does better than $Y_k$ ,
but the improvement is very small (for small k ).

Thus X/(n+2) is "almost inadmissible", in this sense.
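The risk comparison in Example 4 can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming only the standard moments $E X = n\sigma^2$ and $\mathrm{Var}\,X = 2n\sigma^4$ of $X = \sum X_i^2$; the function names are hypothetical and are not part of the original text.

```python
from fractions import Fraction


def risk_from_moments(n, k, sigma2):
    """Quadratic risk of Y_k = (X + k)/(n + 2), computed as variance plus
    squared bias from E X = n*sigma2 and Var X = 2*n*sigma2**2."""
    variance = Fraction(2 * n) * sigma2 ** 2 / (n + 2) ** 2
    bias = (n * sigma2 + k) / Fraction(n + 2) - sigma2
    return variance + bias ** 2


def risk_formula(n, k, sigma2):
    """Right-hand side of (1.33)."""
    return (k ** 2 - 4 * k * sigma2 + 2 * (n + 2) * sigma2 ** 2) / Fraction((n + 2) ** 2)


n, k = 10, Fraction(1, 2)
for sigma2 in (Fraction(1, 16), Fraction(1, 8), Fraction(1, 4), Fraction(3)):
    r_k = risk_from_moments(n, k, sigma2)
    r_0 = risk_from_moments(n, 0, sigma2)
    # The sign of r_0 - r_k changes exactly at sigma2 = k/4.
    print(sigma2, r_k == risk_formula(n, k, sigma2), r_0 - r_k)
```

The printed difference $R(Y_0,\sigma^2) - R(Y_k,\sigma^2)$ is negative below $\sigma^2 = k/4$, zero at $k/4$, and positive above it, in line with (1.34).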


Example 5. Suppose that $X_1, \dots, X_n$ is a sample from the
Gamma density: $\frac{\beta^\alpha}{\Gamma(\alpha)}\,x^{\alpha-1}\,e^{-\beta x}\,I_{(0,\infty)}(x)$ , where $\alpha > 0$ is
known, and $\beta > 0$ is unknown.

We want to estimate $g(\beta) = \beta$ .

Since $X = \sum_{i=1}^{n} X_i$ is sufficient for $\beta$ and the density of
X is also Gamma with parameters $n\alpha$ and $\beta$ , by using the same
technique as in Example 1, we find that $(n\alpha - 2)/X$ is admissible
for estimating $\beta$ .

If $\alpha = m$ (an integer) and n = 1 , we get the estimator
(m - 2)/X obtained by Ghosh and Singh (1970).


Example 6. In this example, we consider again $X_1, X_2, \dots, X_n$
iid with exponential density $\lambda e^{-\lambda x} I_{(0,\infty)}(x)$ , $\lambda > 0$ , and


we  want  to  estimate  g(X)  =  y-  ■—  y 


We shall find here two admissible estimators which have
the most general form (aX + b)/(cX + d) , with $a, b, c, d \ne 0$ .
We denote again by $X = \sum_{i=1}^{n} X_i$ , $\theta = -\lambda$ . The density of X is

given  by  (1.31)  and  g(8)  =  — — — j-  . 

We claim that if $n \ge 3$ , the estimators


(1.35) 


1  -  X/(n 
1  +  X/(n 


(1.36) 


1  -  X/(n 
1  +  x/(n 


are both admissible in estimating $g(\theta)$ with quadratic loss.

It is easy to see that assumptions (A1) through (A3) are
satisfied. Consider the estimator (1.35), and the integral:

(1.37)  $\int_{-\infty}^{\infty} \frac{e^{\theta x}}{(cx + d)^2}\,d\mu(x) = \int_0^{\infty} \frac{e^{\theta x}\,x^{n-1}}{(x + n - 1)^2}\,dx$ .

Clearly,

(1.38)  $\int_0^{\infty} \frac{e^{\theta x}\,x^{n-1}}{(x + n - 1)^2}\,dx \le \int_0^{\infty} x^{n-3}\,e^{\theta x}\,dx = \frac{\Gamma(n - 2)}{(-\theta)^{n-2}}$ .

By using this inequality, it is easy to show that the
hypotheses of Theorem 1.3 are satisfied.

The  estimator  (1.36)  is  handled  in  a  similar  way. 


1.4.  Truncated  parameter  space 

Here we give an explicit formula for an admissible estimator
in the case of truncated parameter space.

Instead of the natural parameter space $\Theta$ , we consider a
subset of it, $\Theta_0 \subseteq \Theta$ . The rationale is that we consider, on
some a priori grounds, that the unknown parameter $\theta$ is
restricted to belong to $\Theta_0$ .

An estimator $\delta_0$ of $g(\theta)$ is called $\Theta_0$-inadmissible, if
there exists another estimator $\delta_1$ of $g(\theta)$ , such that

(1.39)  $R(g(\theta),\delta_1) \le R(g(\theta),\delta_0)$ for every $\theta \in \Theta_0$ ,

(1.40)  $R(g(\theta_0),\delta_1) < R(g(\theta_0),\delta_0)$ for some $\theta_0 \in \Theta_0$ .

An estimator $\delta_0$ is called $\Theta_0$-admissible if it is not
$\Theta_0$-inadmissible.

In general, there is no relation between $\Theta_0$-admissibility
and admissibility. In particular, an admissible estimator need
not be $\Theta_0$-admissible.

We shall consider here the particular case when
$\Theta_0 = \{\theta < \theta_0\} \subset \Theta$ , where $\theta_0$ is supposed to be known.
Recall that X is an observation from the one parameter
exponential density.

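The idea in finding a $\Theta_0$-admissible estimator is to use
the same prior $\pi(\theta)$ given by (1.13) and to compute the
generalized Bayes estimator. A simple calculation gives:

(1.41)  $\delta(x) = \frac{\int_{\Theta_0} g(\theta)\,\beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta}{\int_{\Theta_0} \beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta} = \frac{\int_{\underline{\theta}}^{\theta_0} g(\theta)\,\beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta}{\int_{\underline{\theta}}^{\theta_0} \beta(\theta)\,e^{\theta x}\,\pi(\theta)\,d\theta}$ .

By using the expression (1.13) of $\pi(\theta)$ we get:

(1.42)  $\delta(X) = \frac{aX + b}{cX + d} + \frac{\exp\left(\theta_0 X + \int_\alpha^{\theta_0} \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right)}{(cX + d) \int_{\underline{\theta}}^{\theta_0} \frac{1}{c\,g(\theta) - a} \exp\left(\theta X + \int_\alpha^\theta \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right) d\theta}$ .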


In obtaining formula (1.42), we need the following fact, which
is easy to prove: if $f \in L^1(\mathbb{R})$ , f is absolutely continuous on
any interval of $\mathbb{R}$ , and $f' \in L^1(\mathbb{R})$ , then $\lim_{|x| \to \infty} f(x) = 0$ .

In our case, the function

(1.43)  $f(\theta) = \exp\left(\theta x + \int_\alpha^\theta \frac{d\,g(t) - b}{c\,g(t) - a}\,dt\right)$

satisfies these assumptions, since we assumed that the quotient
in the right-hand side of (1.9) exists.

We can now give sufficient conditions for the $\Theta_0$-admissibility
of the estimator $\delta(X)$ given by (1.42):

Theorem 1.4. Suppose, with the notations of Theorem 1.3, that


(1.44) 


Then the estimator $\delta(X)$ given by (1.42) is $\Theta_0$-admissible in
estimating $g(\theta)$ with quadratic loss.

The proof is similar to that of Theorem 1.3 and will be omitted.

Note that in proving the $\Theta_0$-admissibility of $\delta(X)$ , only the
second condition in (1.15) is needed, due to the truncation of the
parameter space.

Theorem 1.4 generalizes a theorem of Katz (1961) and it is
also a generalization of the corresponding result of Ghosh and
Meeden (1977), who found $\Theta_0$-admissible estimators of the form
$aX + b + \phi(X)$ (where $\phi(X)$ is the "correction" due to the
truncation).

In the following, we give an example of admissible estimators
in the truncated case, when sampling from an exponential density:

Example. Consider $X_1, X_2, \dots, X_n$ iid with exponential density
$\lambda e^{-\lambda x} I_{(0,\infty)}(x)$ , where $\lambda > 0$ .

For the natural parameter space $\Theta = (0,\infty)$ , the estimator
(n - 2)/X is admissible in estimating $g(\lambda) = \lambda$ , as in Example
1 of Section 1.3.

Suppose that we know that $\lambda > 1$ and we want to estimate
the same function $g(\lambda) = \lambda$ .

Then it is easy to see that the condition of Theorem 1.4 is
satisfied, and the corresponding estimator given by (1.42) is
$(1,\infty)$-admissible.

A simple calculation gives:

(1.45)  $\delta(X) = \frac{n - 2}{X} + \left(X e^{X} \int_1^{\infty} t^{n-3}\,e^{-tX}\,dt\right)^{-1}$ .

An explicit formula for this estimator can be given by using

(1.46)  $\int_{-\infty}^{-a} y^k e^y\,dy = (-1)^k\,k!\,e^{-a}\left[1 + \frac{a}{1!} + \dots + \frac{a^k}{k!}\right]$ .

We finally obtain

(1.47)  $\delta(X) = \frac{n - 2}{X} + \frac{X^{n-3}/(n - 3)!}{1 + X/1! + X^2/2! + \dots + X^{n-3}/(n - 3)!}$ .

In the particular case n = 3 (i.e., there are three
observations $X_1, X_2, X_3$ ) we get:

(1.48)  $\delta(X) = \frac{1}{X} + 1$ .

It is easy to compute the risk of this estimator:

(1.49)  $R(\lambda,\delta) = \frac{\lambda^2 - 2\lambda + 2}{2}$ .

The risk of $\delta_0(X) = 1/X$ (which is admissible in the
non-truncated case and n = 3) is

(1.50)  $R(\lambda,\delta_0) = \lambda^2/2$ .

We observe that $R(\lambda,\delta) < R(\lambda,\delta_0)$ for $\lambda > 1$ . This shows,
among other things, that $\delta_0$ is inadmissible for the truncated
problem.
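The formulas of this example lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions: X is taken to be Gamma(n, $\lambda$) as in Example 1 of Section 1.3, the infinite integrals are truncated at a finite upper limit, and all function names are hypothetical. It compares the integral form (1.45) with the closed form (1.47), and the risks (1.49) and (1.50) for n = 3.

```python
import math


def simpson(f, lo, hi, panels=20000):
    """Composite Simpson rule on [lo, hi]; `panels` must be even."""
    h = (hi - lo) / panels
    total = f(lo) + f(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def delta_integral(x, n):
    """Estimator (1.45), with the integral over (1, inf) truncated."""
    upper = 1.0 + 60.0 / x
    integral = simpson(lambda t: t ** (n - 3) * math.exp(-t * x), 1.0, upper)
    return (n - 2) / x + 1.0 / (x * math.exp(x) * integral)


def delta_closed(x, n):
    """Estimator (1.47): closed form in terms of a partial exponential sum."""
    top = x ** (n - 3) / math.factorial(n - 3)
    bottom = sum(x ** j / math.factorial(j) for j in range(n - 2))
    return (n - 2) / x + top / bottom


def risk_n3(lam, estimator):
    """Quadratic risk for n = 3, X ~ Gamma(3, lam), by numerical integration."""
    def integrand(x):
        density = lam ** 3 * x ** 2 * math.exp(-lam * x) / 2.0
        return (estimator(x) - lam) ** 2 * density
    return simpson(integrand, 1e-9, 80.0 / lam)


print(delta_integral(1.7, 6), delta_closed(1.7, 6))
for lam in (1.0, 2.0, 4.0):
    print(lam,
          risk_n3(lam, lambda x: 1.0 / x + 1.0), (lam ** 2 - 2 * lam + 2) / 2,
          risk_n3(lam, lambda x: 1.0 / x), lam ** 2 / 2)
```

The two risks coincide at $\lambda = 1$ and the truncated estimator has the smaller risk for $\lambda > 1$, which matches the comparison of (1.49) and (1.50).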
CHAPTER 2

LINEAR MINIMAX ESTIMATORS

The minimax principle for estimation problems can be stated
as follows. Let $\delta$ be an estimator of a function $g(\theta)$ , and
let $R(g(\theta),\delta)$ be the corresponding risk function. An
estimator $\delta_0$ is called minimax if

(2.1)  $\sup_{\theta \in \Theta} R(g(\theta),\delta_0) = \inf_{\delta \in \mathcal{D}} \sup_{\theta \in \Theta} R(g(\theta),\delta)$ ,

where $\mathcal{D}$ denotes the set of all estimators.

Intuitively, a minimax estimator is one which minimizes
the largest possible risk. One can also say that a minimax
estimator is a Bayes estimator against a prior distribution on
$\Theta$ , which is least favorable for the estimation problem (see
Zacks (1971), Chapter 6).

There are many relationships between minimax and admissible
estimators. For example, if $\delta_0$ is admissible and has a
constant risk, then $\delta_0$ is minimax.

In this chapter, we give sufficient conditions for the
minimaxity of the classical estimator $\delta_0(X) = X$ , in estimating an
arbitrary (differentiable) function $g(\theta)$ . Our results
generalize those of Ping (1964), who gave sufficient conditions for
the minimaxity of affine estimators of the form $(X + k\lambda)/(1 + \lambda)$ ,
$\lambda > -1$ in estimating the mean $E_\theta X = m(\theta)$ .

The fact that we find conditions for the minimaxity of the
usual estimator $\delta(X) = X$ , and we do not consider affine
estimators of the most general form aX + b , causes no loss in
generality, and is explained in Section 2.1.

Recall that X denotes a random variable, whose probability
density (with respect to a $\sigma$-finite measure) belongs to the one
parameter exponential family: $f_\theta(x) = \beta(\theta)e^{\theta x}$ . As in Chapter
1, $\underline{\theta}$ and $\bar{\theta}$ denote the endpoints of the natural parameter
space $\Theta$ .

In estimating an arbitrary function $g(\theta)$ , the loss function
which will be considered in this chapter has the form:

(2.2)  $L(\delta(x), g(\theta)) = \frac{(\delta(x) - g(\theta))^2}{\sigma^2(\theta)}$ ,

where $\sigma^2(\theta) = \mathrm{Var}_\theta X = E_\theta(X - E_\theta X)^2$ .

Note that, since the density of X belongs to the exponential
family, we have $\sigma^2(\theta) = m'(\theta)$ , where $m(\theta) = E_\theta X$ .

The choice of the normalized loss (2.2) is especially
desirable in those problems for which, when the loss is squared error,
the minimax risk is infinite. When this happens, any estimator
is minimax and the minimax principle provides no basis for choice.

In Section 2.1 we give the sufficient condition for
minimaxity. The proof of Theorem 2.1 uses the Cramér-Rao inequality.

In Section 2.2 we give some examples. While Theorem 2.1
includes many of the classical minimax estimators, we concentrate
on two examples where the function to be estimated is different
from the mean. The presence of linear minimax estimators arises
especially in estimating a function of the scale parameter in a
Gamma density.


2.1. Minimaxity of X

Let $g(\theta)$ be a function of an unknown parameter $\theta$ . Later,
further restrictions will be imposed on g . The risk in
estimating $g(\theta)$ by $\delta(X)$ with the loss (2.2) is

(2.3)  $R(\delta(X),g(\theta)) = E_\theta\{L(\delta(X),g(\theta))\} = R(\delta(X),\theta)$ .

The following linearity property of minimax estimators is
easy to prove: $\delta(X)$ is minimax in estimating $g(\theta)$ if and only
if $a\delta(X) + b$ is minimax in estimating $a\,g(\theta) + b$ ( $a, b \in \mathbb{R}$ ,
$a \ne 0$ ).

Because of this fact, we only need to give sufficient
conditions for the minimaxity of X in estimating $g(\theta)$ , rather
than consider more general estimators of the form aX + b .

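Theorem 2.1. Let X have density $f_\theta(x) = \beta(\theta)e^{\theta x}$ and let
$g(\theta)$ be a differentiable function of $\theta$ . Denote the endpoints
of $\Theta$ by $\underline{\theta}$ and $\bar{\theta}$ , respectively. Suppose that the following
conditions are satisfied:

(i)  $\sup_{\theta \in \Theta} \frac{(g(\theta) - m(\theta))^2}{\sigma^2(\theta)} = \lim_{\theta \to \bar{\theta}} \frac{(g(\theta) - m(\theta))^2}{\sigma^2(\theta)} < \infty$ ,

(ii)  $\liminf_{\theta \to \bar{\theta}}\ \beta^{-1}(\theta)\,\sigma(\theta) \exp\left(-\int_a^\theta g(t)\,dt\right) > 0$ ,

(iii)  $\lim_{\theta \to \bar{\theta}} \int_a^\theta \beta(t) \exp\left(\int_a^t g(s)\,ds\right) dt = \infty$ ,

where $a \in \mathrm{Int}\,\Theta$ .

Then X is minimax in estimating $g(\theta)$ with loss (2.2).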


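Proof: We shall use the Cramér-Rao inequality. If $\delta(X)$ is an
estimator of $g(\theta)$ with bias function $b_\delta(\theta) = E_\theta(\delta(X)) - g(\theta)$ ,
then

(2.4)  $\mathrm{Var}_\theta\,\delta(X) \ge \frac{(b_\delta'(\theta) + g'(\theta))^2}{I_X(\theta)}$ ,

where $I_X(\theta) = E_\theta\left(\frac{\partial}{\partial\theta} \log f_\theta(X)\right)^2$ .

From (2.4), since $I_X(\theta) = \sigma^2(\theta)$ for the exponential family,
we get

(2.5)  $R(\delta(X),g(\theta)) \ge \frac{1}{\sigma^2(\theta)} \left[b_\delta^2(\theta) + \frac{(b_\delta'(\theta) + g'(\theta))^2}{\sigma^2(\theta)}\right]$ .

For the rest of the proof, for simplicity, we suppress the
variables in denoting functions.

Suppose that X is not minimax, i.e.,

(2.6)  $\sup_\theta R(X,\theta) > \inf_\delta \sup_\theta R(\delta(X),\theta)$ .

First, note that the risk of X is given by

(2.7)  $R(X,\theta) = 1 + \frac{(g(\theta) - m(\theta))^2}{\sigma^2(\theta)}$ .

Thus, from (2.5), (2.6), and (2.7), there exists an $\varepsilon > 0$ and
an estimator $\delta$ such that

(2.8)  $\frac{1}{\sigma^2}\left[b^2 + \frac{(b' + g')^2}{\sigma^2}\right] < 1 - \varepsilon + \sup_\theta \frac{(g(\theta) - m(\theta))^2}{\sigma^2(\theta)}$ .

Let $K = \sup_\theta \frac{(g(\theta) - m(\theta))^2}{\sigma^2(\theta)}$ , and let

(2.9)  $u(\theta) = b(\theta) + (g(\theta) - m(\theta))$ .

Since $b' + g' = m' + u'$ , (2.8) becomes

(2.10)  $u^2 + (g - m)^2 - 2(g - m)u + \frac{(m' + u')^2}{\sigma^2} < (1 + K - \varepsilon)\,\sigma^2$ .

We now relax the inequality (2.10) (i.e., neglect the term
in $u'^2$ ) to obtain (note that $m' = \sigma^2$ ):

(2.11)  $u^2 - 2(g - m)u + 2u' < -\varepsilon\sigma^2 + \sigma^2\left(K - \frac{(g - m)^2}{\sigma^2}\right)$ .

By using the assumption (i), there exists a point $\theta_1 \in \Theta$
such that for all $\theta_1 < \theta < \bar{\theta}$ , we have $K - \frac{(g - m)^2}{\sigma^2} < \frac{\varepsilon}{2}$ .
Thus

(2.12)  $u^2 - 2(g - m)u + 2u' < -\frac{\varepsilon}{2}\,\sigma^2$

for all $\theta > \theta_1$ .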

Define a new function $v = \beta^{-1} \exp\left(-\int_a^\theta g(t)\,dt\right) u$ . Then,
after some calculations, (2.12) simplifies to

(2.13)  $\beta^2 \exp\left(2\int_a^\theta g(t)\,dt\right) v^2 + 2\beta \exp\left(\int_a^\theta g(t)\,dt\right) v' < -\frac{\varepsilon}{2}\,\sigma^2$ ,

which can be written as

(2.14)  $\beta \exp\left(\int_a^\theta g(t)\,dt\right) v^2 + 2v' < -\frac{\varepsilon}{2}\,\sigma^2 \beta^{-1} \exp\left(-\int_a^\theta g(t)\,dt\right)$

for all $\theta > \theta_1$ .

Now, by (ii), there exists a constant c > 0 and a point
$\theta_2$ , such that for all $\theta > \theta_2$ , we have
$\beta^{-1}\sigma \exp\left(-\int_a^\theta g(t)\,dt\right) > c\,\sqrt{2/\varepsilon}$ .
Then, for $\theta > \theta_3 = \max(\theta_1,\theta_2)$ , we have by (2.14):

(2.15)  $\beta \exp\left(\int_a^\theta g(t)\,dt\right) v^2 + 2v' < -c^2 \beta \exp\left(\int_a^\theta g(t)\,dt\right)$ .

Note that $v' < 0$ for $\theta > \theta_3$ , i.e., v decreases for
$\theta > \theta_3$ , and so $\lim_{\theta \to \bar{\theta}} v(\theta)$ exists. We get

(2.16)  $2v'\,[v^2 + c^2]^{-1} < -\beta \exp\left(\int_a^\theta g(t)\,dt\right)$

for all $\theta > \theta_3$ .

Integrating both sides of (2.16), we obtain

(2.17)  $\frac{2}{c} \tan^{-1}\left(\frac{v(t)}{c}\right)\Big|_{\theta_3}^{\theta} \le -\int_{\theta_3}^{\theta} \beta(t) \exp\left(\int_a^t g(s)\,ds\right) dt$ .

Letting $\theta \to \bar{\theta}$ , the right-hand side of (2.17) approaches $-\infty$
by assumption (iii), while the left-hand side is finite. This
contradiction ends the proof of the theorem.

Remarks. (a) If we let $h = \frac{g - m}{\sigma}$ , then, since
$\exp\left(\int_a^\theta m(t)\,dt\right) = d\,\beta^{-1}(\theta)$ , where d is a positive constant,
Theorem 2.1 can be restated as follows:

Assume

(i)'  $\sup_\theta h^2(\theta) = \lim_{\theta \to \bar{\theta}} h^2(\theta) < \infty$ ,

(ii)'  $\liminf_{\theta \to \bar{\theta}}\ \sigma(\theta) \exp\left(-\int_a^\theta \sigma(t)h(t)\,dt\right) > 0$ ,

(iii)'  $\int_a^{\bar{\theta}} \exp\left(\int_a^\theta \sigma(t)h(t)\,dt\right) d\theta = \infty$ .

Then X is minimax in estimating $\sigma h + m$ with loss (2.2).

If h = 0 , and $\Theta = \mathbb{R}$ , then Theorem 2.1 takes a very
simple form:

If

(2.18)  $\liminf_{\theta \to +\infty} \sigma^2(\theta) > 0$  or  $\liminf_{\theta \to -\infty} \sigma^2(\theta) > 0$ ,

then X is minimax in estimating $m(\theta)$ .

Indeed, in this case (i)' and (iii)' are obviously satisfied,
while (ii)' (or its dual form) becomes (2.18) above.

We point out, however, that this condition (ii)' , while
being sufficient, is not necessary for the minimaxity of X .
Indeed, consider a random variable X whose distribution is
binomial with parameters $\left(n, \frac{e^\theta}{1 + e^\theta}\right)$ , where $\theta \in \Theta = (-\infty,\infty)$ . Then
$\sigma^2(\theta) = \frac{n e^\theta}{(1 + e^\theta)^2}$ , and $\lim_{\theta \to \pm\infty} \sigma^2(\theta) = 0$ .

However, it follows from Girshick and Savage (1951) that
X is admissible and, having a constant risk, it is minimax.
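The binomial counterexample can be illustrated numerically. The sketch below is a minimal check, not part of the original argument: it evaluates $\sigma^2(\theta) = n e^\theta/(1 + e^\theta)^2$ far out in both tails and computes the risk of X for its mean under the normalized loss (2.2) directly from the probability mass function. The function names are hypothetical.

```python
import math


def binomial_variance(n, theta):
    """sigma^2(theta) = n * p * (1 - p) with p = e^theta / (1 + e^theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return n * p * (1.0 - p)


def normalized_risk_of_x(n, theta):
    """E_theta (X - m(theta))^2 / sigma^2(theta), summed over the pmf."""
    p = 1.0 / (1.0 + math.exp(-theta))
    mean = n * p
    second_moment = sum(
        math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) * (j - mean) ** 2
        for j in range(n + 1)
    )
    return second_moment / binomial_variance(n, theta)


for theta in (-30.0, -2.0, 0.3, 2.0, 30.0):
    print(theta, binomial_variance(10, theta))
for theta in (-2.0, 0.3, 2.0):
    print(theta, normalized_risk_of_x(10, theta))
```

The variance values shrink toward 0 in both tails, so (2.18) fails, while the normalized risk stays equal to 1 at every moderate $\theta$ tried, which is the constant-risk property used in the remark.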

(b) Our condition (i) is more general than a similar
condition of Ping (1964), and also (ii) allows for the limit inferior
to be infinite (this is, actually, the case in many examples).

(c) It is interesting to compare conditions (i) - (iii) with
the sufficient conditions for admissibility (Theorem 1.2). While
the latter involve the behavior of a certain integral in the
neighborhood of both endpoints $\underline{\theta}$ and $\bar{\theta}$ , the former involve the
behavior of the same integral and of another expression in the
neighborhood of one endpoint ( $\bar{\theta}$ , say), and also a global
condition, (i). This global condition is a kind of maximum principle
for the square of the normalized function $h(\theta) = \frac{g(\theta) - m(\theta)}{\sigma(\theta)}$ .


2.2.  Examples 

It is easy to see that Theorem 2.1 can be used to prove the
minimaxity of many classical estimators. For example, if
$X_1, X_2, \dots, X_n$ are normally distributed with mean $\mu$ and variance
1 , then, by using Theorem 2.1 (and simple changes of scale), it is
easy to check that

(2.19)  $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

is minimax in estimating the mean $\mu$ with quadratic loss.

Some more examples of this kind are given in Ping (1964)
although his condition analogous to (i) needs some special care.

We now describe two examples which are related to the
estimation of a function of the scale parameter in a Gamma density.


Example 1. Suppose that $X_1, X_2, \dots, X_n$ are iid with exponential
density $\lambda e^{-\lambda x} I_{(0,\infty)}(x)$ where $\lambda > 0$ . We want to estimate the
function

(2.20)  $g(\lambda) = \frac{n + e^{-\lambda}}{\lambda}$ .

Since $X = \sum_{i=1}^{n} X_i$ is a sufficient statistic for $\lambda$ , we can
consider estimators based on X . The density of X is Gamma
with parameters n and $\lambda$ .

By changing the parameter to $\theta = -\lambda$ , we get

(2.21)  $f_\theta(x) = \frac{(-\theta)^n}{\Gamma(n)}\,x^{n-1}\,e^{\theta x}\,I_{(0,\infty)}(x)$ ,  $\theta < 0$ .

We claim that X is minimax in estimating $g(\theta) = -\frac{n + e^\theta}{\theta}$ ,
with the loss given by (2.2).

It is easy to see that the assumptions (i)'-(iii)' are
satisfied with $h = e^\theta/\sqrt{n}$ . Expand $e^t/t$ as a power series:
$e^t/t = \frac{1}{t} + 1 + \frac{t}{2!} + \frac{t^2}{3!} + \dots$ . Then

(2.22)  $\int_a^\theta \frac{e^t}{t}\,dt = \log(-\theta) + \sum_{k=1}^{\infty} \frac{\theta^k}{k \cdot k!} + c$ ,

where c is a constant. Thus,

(2.23)  $\liminf_{\theta \to 0}\ \sigma \exp\left(-\int_a^\theta \sigma h(t)\,dt\right) = \sqrt{n}\,e^c \lim_{\theta \to 0} \exp\left(\sum_{k=1}^{\infty} \frac{\theta^k}{k \cdot k!}\right) > 0$ ,

and so (ii)' is satisfied.

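Also, since $\lim_{\theta \to 0} \exp\left(-\sum_{k=1}^{\infty} \frac{\theta^k}{k \cdot k!}\right) = 1$ , there exists $\theta_1 < 0$
such that for all $\theta_1 < \theta < 0$ , this exponential exceeds 1/2 , and so

(2.24)  $\int_{\theta_1}^{0} \exp\left(\int_a^\theta \sigma h(t)\,dt\right) d\theta \ge \frac{e^{-c}}{2} \int_{\theta_1}^{0} \left(-\frac{1}{\theta}\right) d\theta = \infty$ ,

showing that (iii)' is satisfied. Finally, since $h = e^\theta/\sqrt{n}$ and
$\bar{\theta} = 0$ , (i)' is also true and the claim is proved.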
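The series (2.22) and the limit in (2.23) can be checked numerically. The sketch below is an illustration under assumptions not fixed by the text: the base point is taken as a = -1, the series is truncated after 60 terms, and the helper names are hypothetical. It uses $\sigma(\theta) = \sqrt{n}/(-\theta)$ and $\sigma h = g - m = -e^\theta/\theta$ for this example.

```python
import math


def simpson(f, lo, hi, panels=2000):
    """Composite Simpson rule on [lo, hi]; `panels` must be even."""
    h = (hi - lo) / panels
    total = f(lo) + f(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def series_antiderivative(theta, terms=60):
    """log(-theta) + sum_{k>=1} theta^k / (k * k!), an antiderivative of
    exp(t)/t on t < 0 (the series in (2.22), truncated)."""
    total = math.log(-theta)
    term = 1.0
    for k in range(1, terms + 1):
        term *= theta / k  # term now equals theta**k / k!
        total += term / k
    return total


def condition_ii_term(theta, n, a=-1.0):
    """sigma(theta) * exp(-int_a^theta sigma*h dt) for this example."""
    exponent = series_antiderivative(theta) - series_antiderivative(a)
    return math.sqrt(n) / (-theta) * math.exp(exponent)


# The series difference should match a direct quadrature of exp(t)/t.
direct = simpson(lambda t: math.exp(t) / t, -2.0, -0.5)
print(direct, series_antiderivative(-0.5) - series_antiderivative(-2.0))

# The quantity in (ii)' approaches a positive limit as theta -> 0 from below.
for theta in (-1e-1, -1e-3, -1e-6):
    print(theta, condition_ii_term(theta, n=4))
```

With a = -1 the printed values settle near $\sqrt{n}\,\exp(-F(-1))$, where F denotes the truncated series above, so the limit inferior in (ii)' is positive as claimed.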


Example 2. In this example we consider $X_1, X_2, \dots, X_n$ normally
distributed with mean 0 and variance $\sigma^2 > 0$ . The function to
be estimated is $(n + 2)\sigma^2 - \frac{2\sigma^2}{2\sigma^2 + 1}$ .

Since $X = \sum_{i=1}^{n} X_i^2$ is sufficient for $\sigma^2$ , we can restrict
our attention to estimators based on X . It is well-known that
$(\sum_{i=1}^{n} X_i^2)/\sigma^2$ is $\chi^2_n$ . If we denote by $\theta = -\frac{1}{2\sigma^2}$ , then $\theta < 0$
and the density of X is

(2.25)  $f_\theta(x) = \frac{(-\theta)^{n/2}}{\Gamma(n/2)}\,x^{n/2-1}\,e^{\theta x}\,I_{(0,\infty)}(x)$ .

Also $m(\theta) = -\frac{n}{2\theta}$ , $\sigma^2(\theta) = \frac{n}{2\theta^2}$ , and

(2.26)  $g(\theta) = [(-\frac{n}{2} - 1)/\theta] - [1/(1 - \theta)]$ .

With $h = \sqrt{2/n}\,/\,(1 - \theta)$ , it is easily seen that conditions
(i)' - (iii)' are satisfied.

Thus X is minimax in estimating $(n + 2)\sigma^2 - \frac{2\sigma^2}{2\sigma^2 + 1}$ .
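The reparametrization in this example is easy to get wrong by a sign, so a short numerical check may help. The sketch below assumes only the relations stated above ( $\theta = -1/(2\sigma^2)$ , $m(\theta) = -n/(2\theta)$ , $\sigma^2(\theta) = n/(2\theta^2)$ ); the function names are hypothetical.

```python
import math


def g_of_sigma2(n, s2):
    """Target function in terms of the variance s2 of each X_i."""
    return (n + 2) * s2 - 2 * s2 / (2 * s2 + 1)


def g_of_theta(n, theta):
    """The same function in the natural parameter, as in (2.26)."""
    return (-n / 2 - 1) / theta - 1 / (1 - theta)


def h_direct(n, theta):
    """(g - m) / sigma computed from m(theta) and sigma^2(theta)."""
    m = -n / (2 * theta)
    sd = math.sqrt(n / (2 * theta ** 2))
    return (g_of_theta(n, theta) - m) / sd


def h_closed(n, theta):
    """Closed form sqrt(2/n) / (1 - theta)."""
    return math.sqrt(2 / n) / (1 - theta)


n = 7
for s2 in (0.3, 1.0, 5.0):
    theta = -1 / (2 * s2)
    print(s2, g_of_sigma2(n, s2), g_of_theta(n, theta),
          h_direct(n, theta), h_closed(n, theta))
```

Both pairs of columns agree, and $h^2$ increases to 2/n as $\theta$ approaches 0 from below, which is the behavior required by (i)'.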
CHAPTER 3

IMPROVING UPON THE BEST INVARIANT ESTIMATOR
IN MULTIVARIATE LOCATION PROBLEMS

In this chapter we consider the problem of finding estimators
which are better than the best invariant estimator $\delta_0(X) = X$ ,
where X is a p-dimensional random vector whose probability
density belongs to the location family: $f_\theta(x) = f(x - \theta)$ , $\theta \in \mathbb{R}^p$ .

We give sufficient conditions for the inadmissibility in
dimensions $p \ge 3$ of the best invariant estimator $\delta_0(X) = X$ , in
estimating the location parameter $\theta$ with convex loss function
$L(\theta,\delta(x)) = L(\delta(x) - \theta)$ . In the particular case when the loss
function is

(3.1)  $L(\theta,\delta(x)) = \sum_{i=1}^{p} c_i\,(\delta_i(x) - \theta_i)^2$ ,

where $c_1, c_2, \dots, c_p$ are given positive constants, and if we make
suitable assumptions about the moments of the density $f(x - \theta)$ ,
we derive various classes of estimators which improve upon the best
invariant estimator $\delta_0(X) = X$ .

Our problem originates from Brown (1975) who proved that in
dimensions $p \ge 3$ , the estimator $\delta_0(X) = X$ is inadmissible in
estimating the mean $\theta$ of a multivariate normal distribution (with
covariance matrix the identity), under the loss function (3.1).

In Section 3.1 we generalize a result of Brown (1975)
concerning sufficient conditions for inadmissibility. As appropriate
background we give a result of Brown (1966) concerning the
inadmissibility in dimensions $p \ge 3$ of the best invariant estimator
$\delta_0(X) = X$ in estimating the location parameter $\theta$ with loss
function (3.1). In Section 3.2 we prove that under suitable
assumptions, the convolution of an estimator $\delta_1$ (which improves upon the
best invariant estimator $\delta_0(X) = X$ outside of a compact set) with
a truncated probability density in $\mathbb{R}^p$ , gives an estimator $\delta$
which is uniformly better than $\delta_0$ . The estimator $\delta_1$ is of the
type $\delta_1(X) = \left(1 - \frac{a}{\|X\|^2}\right) X$ , where a is a suitable constant (such
estimators are called James-Stein estimators). We also give
several examples of estimators obtained in this way, which improve
upon $\delta_0$ . In Section 3.3 we derive some other estimators which
improve upon $\delta_0$ in higher dimensions. The critical dimension
for which we have an improvement upon $\delta_0$ depends on $\delta_1$ , the
latter being an estimator which improves upon $\delta_0$ outside of a
compact set and which is not of the James-Stein type.

Finally, we state some problems which were not solved in the
present context and which could be of further research interest.


3.1.  The  inadmissibility  result 

Suppose the density of $X$ to be of the location type $f(x - \theta)$
and consider the general loss function $L(\theta, \delta(x)) = L(\delta(x) - \theta)$.

By $\|\cdot\|$ we denote the Euclidean norm in $\mathbb{R}^p$, and $[q]$
denotes the largest integer not exceeding $q$.

The key lemma proved below gives sufficient conditions for the
inadmissibility of $\delta_0$. It is a generalization of Proposition 1
in Brown (1975).

Lemma 3.1. Suppose that the following hypotheses are satisfied:

(i) $L$ is a convex function;

(ii) there exists an estimator $\delta_1$ whose risk $R(\theta, \delta_1)$
is bounded on compact subsets of $\mathbb{R}^p$;

(iii) $\liminf_{\|\theta\| \to \infty} \|\theta\|^q [R(\theta, \delta_0) - R(\theta, \delta_1)] > 0$, for some $q > 0$.

Then, for $p \ge [q] + 1$, the estimator $\delta_0(x) = x$ is
inadmissible in estimating $\theta$ with loss function $L(\delta(x) - \theta)$.

Proof: Denote $\Delta(\theta) = R(\theta, \delta_0) - R(\theta, \delta_1)$ and let
$0 < a < \liminf_{\|\theta\| \to \infty} \|\theta\|^q \Delta(\theta)$. Then, for some
$r > 0$, we have $\Delta(\theta) > a/\|\theta\|^q$ for $\|\theta\| > r$.

Obviously $R(\theta, \delta_0) = R_0$ (a constant) and, by (ii),
$R(\theta, \delta_1) \le B$ for $\|\theta\| \le r$ (where $B$ denotes a
suitable constant). Thus

(3.2)  $\Delta(\theta) \ge \phi(\|\theta\|)$

where the function $\phi$ is defined by:

(3.3)  $\phi(t) = R_0 - B$ if $0 \le t \le r$;  $\phi(t) = a/t^q$ if $t > r$.


Now we use the "randomization of the origin" argument of Brown
(1975): denote by

(3.4)  $\delta_1^\tau(x) = \tau + \delta_1(x - \tau)$, $\tau \in \mathbb{R}^p$.

Observe that $R(\theta, \delta_1^\tau) = R(\theta - \tau, \delta_1)$, since
$\theta$ is a location parameter.

The idea is now to consider $\tau$ as a random variable (whose
distribution will be specified later) and, by taking

(3.5)  $\delta_2(X) = E_\tau[\delta_1^\tau(X)]$

to try to improve upon the risk of $\delta_0$.

A calculation using the fact that $L$ is a convex function and
applying the Jensen inequality gives:

(3.6)  $R(\theta, \delta_2) \le E_\tau[R(\theta, \delta_1^\tau)] = E_\tau[R(\theta - \tau, \delta_1)]$.

Note that $R(\theta, \delta_0) = E_\tau[R(\theta - \tau, \delta_0)]$, since we have
$R(\theta - \tau, \delta_0) = R(\theta, \delta_0^\tau) = R(\theta, \delta_0)$. Thus:

(3.7)  $R(\theta, \delta_0) - R(\theta, \delta_2) \ge E_\tau[R(\theta - \tau, \delta_0) - R(\theta - \tau, \delta_1)] = E_\tau[\Delta(\theta - \tau)] \ge E_\tau[\phi(\|\theta - \tau\|)]$

where $E_\tau$ denotes the integral with respect to the probability
measure associated to the random variable $\tau$.
Now, choose $\tau$ uniformly distributed in the ball with
center 0 and radius $K$ in $\mathbb{R}^p$:

(3.8)  $B_K(0) = \{z \in \mathbb{R}^p : \|z\| \le K\}$.

We show that a suitable choice of $K > 0$ will imply that

(3.9)  $E_\tau[\phi(\|\theta - \tau\|)] > 0$, for all $\theta \in \mathbb{R}^p$.

This will prove the inadmissibility of $\delta_0$.

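We have:

(3.10)  $E_\tau[\phi(\|\theta - \tau\|)] = \frac{1}{\alpha K^p} \left[ \int_{\|\tau\| \le K,\ \|\theta - \tau\| \le r} (R_0 - B)\, d\tau + \int_{\|\tau\| \le K,\ \|\theta - \tau\| > r} \frac{a}{\|\theta - \tau\|^q}\, d\tau \right]$

where $\alpha = K^{-p}\, \mathrm{Volume}\, B_K(0)$.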

Suppose that $R_0 - B < 0$ since, otherwise, we are done.
Observe that if $\|\theta\| > K + r$, then:

(3.11)  $E_\tau[\phi(\|\theta - \tau\|)] = \frac{1}{\alpha K^p} \int_{\|\tau\| \le K} \frac{a}{\|\theta - \tau\|^q}\, d\tau > 0$.

So, it remains to find $K > 0$ such that, for $\|\theta\| \le K + r$,
$E_\tau[\phi(\|\theta - \tau\|)] > 0$.


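But, if $\lambda_p$ denotes the p-dimensional Lebesgue measure, we
have (using $\|\theta - \tau\| \le \|\theta\| + \|\tau\| \le 2K + r$ on the
region of integration):

(3.12)  $E_\tau[\phi(\|\theta - \tau\|)] \ge \frac{1}{\alpha K^p} \left[ (R_0 - B)\, \lambda_p(A) + \frac{a}{(2K + r)^q}\, \lambda_p(A_1) \right]$

where $A = \{\tau : \|\tau\| \le K\} \cap \{\tau : \|\tau - \theta\| \le r\}$ and
$A_1 = \{\tau : \|\tau\| \le K\} \setminus A$. Since

(3.13)  $\lambda_p(A) + \lambda_p(A_1) = \lambda_p(B_K(0)) = \alpha K^p$

we can write:

(3.14)  $E_\tau[\phi(\|\theta - \tau\|)] \ge \frac{1}{\alpha K^p} \left[ (R_0 - B)\, \lambda_p(A) + \frac{a}{(2K + r)^q} \left( \alpha K^p - \lambda_p(A) \right) \right]$.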

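Clearly $\lambda_p(A) \le \alpha r^p$ and we get:

(3.15)  $E_\tau[\phi(\|\theta - \tau\|)] \ge \frac{1}{\alpha K^p} \left\{ \left[ R_0 - B - \frac{a}{(2K + r)^q} \right] \alpha r^p + \frac{a\, \alpha K^p}{(2K + r)^q} \right\}$.

Since for $p \ge [q] + 1$ we have:

(3.16)  $\lim_{K \to \infty} \left\{ \left[ R_0 - B - \frac{a}{(2K + r)^q} \right] \alpha r^p + \frac{a\, \alpha K^p}{(2K + r)^q} \right\} = \infty$

the proof of the inadmissibility of $\delta_0$ is completed.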
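
The last step of the proof can be illustrated numerically. The sketch below evaluates the expression in braces in (3.15), divided by the common factor $\alpha$, and doubles $K$ until it becomes positive. The function names and the numerical values of $r$, $a$ and $R_0 - B$ are hypothetical and chosen only for illustration; they do not come from the text.

```python
def bound_3_15(K, p, q, r, a, r0_minus_b):
    """Braces of (3.15) divided by alpha:
    [R0 - B - a/(2K+r)^q] * r^p + a*K^p/(2K+r)^q."""
    denom = (2.0 * K + r) ** q
    return (r0_minus_b - a / denom) * r ** p + a * K ** p / denom


def find_radius(p, q, r, a, r0_minus_b, k_start=1.0, k_max=1e12):
    """Double K from k_start until the bound is positive.

    Returns the first K on the doubling grid with a positive bound,
    or None if no such K is found below k_max.
    """
    K = k_start
    while K <= k_max:
        if bound_3_15(K, p, q, r, a, r0_minus_b) > 0.0:
            return K
        K *= 2.0
    return None


if __name__ == "__main__":
    # Illustrative constants (assumptions): q = 2, r = 1, a = 1, R0 - B = -5.
    print(find_radius(p=3, q=2, r=1.0, a=1.0, r0_minus_b=-5.0))
    print(find_radius(p=2, q=2, r=1.0, a=1.0, r0_minus_b=-5.0))
```

With these illustrative constants a radius is found when $p = 3 > q$, while for $p = 2 = q$ the search fails, in line with the requirement $p \ge [q] + 1$.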

Note. Observe that condition (ii) in the statement of
Lemma 3.1 is more general than the corresponding one in Brown (1975),
where $R(\theta, \delta_1)$ is required to be bounded on $\mathbb{R}^p$.

In particular, our condition (ii) is satisfied whenever
$R(\theta, \delta_1)$ is a continuous function of $\theta$.


From now on, unless otherwise stated, we shall consider
the more particular loss function (3.1). Recall that $X$ has
a density of the location type $f(x - \theta)$.

Denote $Z = X - \theta$; obviously $Z$ has the density $f(x)$,
which is independent of $\theta$.

We shall make the following assumptions:

(a) $E(Z) = (E(Z_1), E(Z_2), \ldots, E(Z_p)) = 0$

(b) $E(Z_i Z_j) = 0$, for all $i \ne j$.

Clearly, (a) is a mild assumption, which is necessary to show
that the best invariant estimator of $\theta$ is $\delta_0(X) = X$.

Assumption (b) combined with (a) states that $Z_i$ and $Z_j$
are uncorrelated random variables, for every $i \ne j$.

We shall now express the risk difference $R(\theta, \delta_0) - R(\theta, \delta)$,
where $\delta$ is any estimator of $\theta$, in a special form. This
calculation, similar to the one in Brown (1975), is of fundamental
importance for all that follows.

Let $\delta$ be any estimator; we have:

(3.17)  $\Delta(\theta) = R(\theta, \delta_0) - R(\theta, \delta) = \sum_{i=1}^{p} c_i \left[ E(X_i - \theta_i)^2 - E(\delta_i(X) - \theta_i)^2 \right] = \sum_{i=1}^{p} c_i \Delta_i(\theta)$

and

$\Delta_i(\theta) = E(X_i - \delta_i(X))(X_i + \delta_i(X)) - 2\theta_i E(X_i - \delta_i(X))$.

Let $h(X) = X - \delta(X)$ and $Z = X - \theta$; then:

(3.18)  $\Delta_i(\theta) = E\, h_i(Z + \theta)[2Z_i - h_i(Z + \theta)] = 2E[Z_i h_i(Z + \theta)] - E[h_i^2(Z + \theta)]$.


By using Taylor's formula, we get:

(3.19)  $h_i(Z + \theta) = h_i(\theta) + \sum_j Z_j h_{ij}(\theta) + e_i(\theta, Z)$

where $h_{ij}(x) = \frac{\partial h_i}{\partial x_j}(x)$ and $e_i$ is an error term.
Therefore (3.18) becomes:

(3.20)  $\Delta_i(\theta) = 2a_i h_i(\theta) + 2\sum_j a_{ij} h_{ij}(\theta) - E[h_i^2(Z + \theta)] + e_i'(\theta)$

where $a_i = E(Z_i)$, $a_{ij} = E(Z_i Z_j)$, $e_i'(\theta) = 2E(Z_i e_i)$. Note
that $a_i$ and $a_{ij}$ do not depend on $\theta$, and also that, for the
sake of generality, in this calculation we do not assume the
conditions (a) and (b) above.

By using again (3.19) and by rewriting the error term, we get:

(3.21)  $\Delta_i(\theta) = 2a_i h_i(\theta) + 2\sum_j a_{ij} h_{ij}(\theta) - h_i^2(\theta) + e_i''(\theta)$.

Finally, by using (3.17), we obtain:

(3.22)  $\Delta(\theta) = 2\sum_i c_i a_i h_i(\theta) + 2\sum_{i,j} c_i a_{ij} h_{ij}(\theta) - \sum_i c_i h_i^2(\theta) + e''(\theta)$.

This formula can be written as:

(3.23)  $\Delta(\theta) = D(\theta) + e''(\theta)$

where

(3.24)  $D(\theta) = 2\sum_i c_i a_i h_i(\theta) + 2\sum_{i,j} c_i a_{ij} h_{ij}(\theta) - \sum_i c_i h_i^2(\theta)$.


Now, we can state the main result of this section. In the
following theorem, and whenever we use this theorem, we
shall assume that the density $f(x - \theta)$ satisfies certain moment
properties as described in detail in Brown (1966).

Theorem 3.1. Suppose that $X$ has density $f(x - \theta)$, $\theta \in \mathbb{R}^p$.
If the following conditions are satisfied:

(a) $E_0(X) = (E_0(X_1), E_0(X_2), \ldots, E_0(X_p)) = 0$

(b) $E_0(X_i X_j) = 0$ for all $i \ne j$

(c) $p \ge 3$

then the estimator $\delta_0(X) = X$ is inadmissible in estimating $\theta$
with loss function given by (3.1).

Proof: Note that in (a) and (b), $E_0$ denotes the expected
value when $\theta = 0$ (these hypotheses are the same as (a) and (b)
discussed before).

We shall use the calculation above. A discussion of the
error terms appears in Brown (1975); a more complete discussion
is given in Brown (1966). Since these apply to our case, we do
not repeat the calculations concerning the errors here.

From formula (3.23) and from Lemma 3.1, it is enough to show
that

(3.25)  $\liminf_{\|\theta\| \to \infty} \|\theta\|^2 D(\theta) > 0$

for a suitable estimator $\delta(X)$.

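Clearly, from assumptions (a) and (b), we have:

(3.26)  $D(\theta) = 2\sum_i c_i a_{ii} h_{ii}(\theta) - \sum_i c_i h_i^2(\theta)$

where $a_{ii} = E_0(X_i^2) = E(Z_i^2) = \mathrm{Var}(Z_i)$ does not depend on $\theta$.

Consider the estimator $\delta_i(X) = X_i - \frac{\epsilon X_i}{c_i a_{ii} \|X\|^2}$, for
$\|X\| \ge 1$. Then $h_{ii}(\theta) = \frac{\epsilon}{c_i a_{ii}} \cdot \frac{\|\theta\|^2 - 2\theta_i^2}{\|\theta\|^4}$. Therefore:

(3.27)  $D(\theta) = 2\sum_i c_i a_{ii} \frac{\epsilon}{c_i a_{ii}} \frac{\|\theta\|^2 - 2\theta_i^2}{\|\theta\|^4} - \sum_i c_i \frac{\epsilon^2 \theta_i^2}{c_i^2 a_{ii}^2 \|\theta\|^4} = \frac{\epsilon}{\|\theta\|^2} \left[ 2p - 4 - \sum_i \frac{\epsilon\, \theta_i^2}{c_i a_{ii}^2 \|\theta\|^2} \right]$, $\|\theta\| > 1$.

Now, since $\sum_i \theta_i^2 / \|\theta\|^2 = 1$, if we choose

(3.28)  $0 < \epsilon < 2(p - 2) \min_{1 \le i \le p} (c_i a_{ii}^2)$

and, since $p \ge 3$ by (c), it follows that

(3.29)  $\|\theta\|^2 D(\theta) \ge \epsilon \left[ 2p - 4 - \frac{\epsilon}{\min_i c_i a_{ii}^2} \right] > 0$.

Finally, $\liminf_{\|\theta\| \to \infty} \|\theta\|^2 \Delta(\theta) \ge \epsilon \left( 2p - 4 - \frac{\epsilon}{\min_i c_i a_{ii}^2} \right) > 0$
and, thus, $\delta_0$ is inadmissible.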
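
As a sanity check on the algebra leading to (3.27), the following sketch compares $D(\theta)$ computed directly from (3.26), with the derivative $h_{ii}$ approximated by a central finite difference, against the closed form in (3.27). The helper names and the numerical values of $c_i$, $a_{ii}$, $\epsilon$ and $\theta$ are assumptions made purely for illustration.

```python
def h(i, theta, eps, c, a):
    """h_i(theta) = eps * theta_i / (c_i * a_ii * ||theta||^2)."""
    norm2 = sum(t * t for t in theta)
    return eps * theta[i] / (c[i] * a[i] * norm2)


def h_ii(i, theta, eps, c, a, step=1e-5):
    """Central finite-difference approximation of d h_i / d theta_i."""
    up = list(theta)
    dn = list(theta)
    up[i] += step
    dn[i] -= step
    return (h(i, up, eps, c, a) - h(i, dn, eps, c, a)) / (2.0 * step)


def d_direct(theta, eps, c, a):
    """D(theta) from (3.26): 2*sum c_i a_ii h_ii - sum c_i h_i^2."""
    p = len(theta)
    first = sum(2.0 * c[i] * a[i] * h_ii(i, theta, eps, c, a) for i in range(p))
    second = sum(c[i] * h(i, theta, eps, c, a) ** 2 for i in range(p))
    return first - second


def d_closed(theta, eps, c, a):
    """Closed form on the right-hand side of (3.27)."""
    p = len(theta)
    norm2 = sum(t * t for t in theta)
    s = sum(eps * theta[i] ** 2 / (c[i] * a[i] ** 2 * norm2) for i in range(p))
    return eps / norm2 * (2 * p - 4 - s)


if __name__ == "__main__":
    theta = [1.5, -2.0, 0.7]       # illustrative point with ||theta|| > 1
    c = [1.0, 2.0, 3.0]            # illustrative loss weights c_i
    a = [1.0, 0.5, 2.0]            # illustrative variances a_ii
    eps = 0.5                      # satisfies (3.28): eps < 2*(3-2)*min(c_i*a_ii^2) = 1
    print(d_direct(theta, eps, c, a), d_closed(theta, eps, c, a))
```

The two values should agree up to the finite-difference error, and both are positive for this choice of $\epsilon$, as (3.29) predicts.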

Note. We mention that Theorem 3.1 generalizes the original
result of Stein (1955), and it is included in the more general
inadmissibility result of Brown (1966). The point here is the
possibility that the components of $X$ might be dependent (but
uncorrelated) and that we used the uniform distribution in Lemma
3.1. As we shall see in the next section, the latter argument
leads to possibilities of generalizations and to a wide class of
estimators which improve upon the best invariant procedure $\delta_0$.

We now give some examples where Theorem 3.1 can be applied:

Example 1. Consider $X$ normally distributed with mean $\theta$
and covariance matrix $\sigma^2 I$, where $\sigma^2$ is known. In this case,
obviously, conditions (a) and (b) are satisfied. Thus, if $p \ge 3$,
the estimator $\delta_0(X) = X$ is inadmissible. This is the classical
problem of estimating the mean vector of a multivariate normal
distribution.
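
A small Monte Carlo sketch can make the normal case concrete. It estimates the risks of $\delta_0(X) = X$ and of the shrinkage estimator from the proof of Theorem 3.1 under loss (3.1) with $c_i = 1$ and $\sigma^2 = 1$. Several choices here are assumptions made for illustration only: the estimator is set equal to $X$ inside the unit ball (the text specifies it only for $\|X\| \ge 1$), $\epsilon = 3$ with $p = 5$, and the function names, sample size and seed.

```python
import random


def shrink(x, eps):
    """X - eps*X/||X||^2 for ||X|| >= 1; equal to X inside the unit ball
    (the behaviour inside the ball is an assumption of this sketch)."""
    norm2 = sum(v * v for v in x)
    if norm2 < 1.0:
        return list(x)
    factor = 1.0 - eps / norm2
    return [factor * v for v in x]


def estimate_risks(theta, eps, n_draws=20000, seed=0):
    """Monte Carlo risks of delta_0(X) = X and of the shrinkage estimator
    under sum-of-squares loss, for X ~ N(theta, I)."""
    rng = random.Random(seed)
    loss0 = 0.0
    loss1 = 0.0
    for _ in range(n_draws):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        d = shrink(x, eps)
        loss0 += sum((xi - t) ** 2 for xi, t in zip(x, theta))
        loss1 += sum((di - t) ** 2 for di, t in zip(d, theta))
    return loss0 / n_draws, loss1 / n_draws


if __name__ == "__main__":
    p = 5
    for theta in ([0.0] * p, [2.0] * p):
        r0, r1 = estimate_risks(theta, eps=3.0)
        print(theta, round(r0, 3), round(r1, 3))
```

The value $\epsilon = 3$ satisfies (3.28), which here reads $0 < \epsilon < 2(p - 2) = 6$. The simulation is only an illustration at two parameter points, not a proof of domination.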

Example 2. Consider $X$ to be uniformly distributed in the
ball $\{x \in \mathbb{R}^p : \|x - \theta\| \le R\}$. In this case we have:

(3.30)  $f(x) = \frac{1}{\alpha R^p}\, I_{\{\|x\| \le R\}}$

and, by using polar coordinates, it can be easily shown that
(a) and (b) are satisfied. This is the case of a spherically
uniform distribution. Minimax estimators, better than $\delta_0$,
were obtained in this case by Brandwein and Strawderman (1978).

Example 3. Consider the observation $X$ having a density
of the form $f(\|x - \theta\|)$, $\theta \in \mathbb{R}^p$. It can be shown, again by
using polar coordinates, that if

(3.31)  $\int \cdots\, f(t)\, dt < \infty$

then conditions (a) and (b) are satisfied. In particular, any
truncated density (see the next section) will satisfy (3.31) and,
therefore, when sampling from such a density, the estimator
$\delta_0(X) = X$ is inadmissible, for $p \ge 3$.


3.2.  Improving  upon  the  best  invariant  estimator 

The problem of improving upon the best invariant estimator
$\delta_0(X) = X$ has received considerable attention in the literature, but
estimators better than $\delta_0$ are only known in special cases, such
as in sampling from a multivariate normal density, or from a
spherically symmetric density.

In the general case of a location parameter, let us observe
that Lemma 3.1 gives a family of estimators which improve upon $\delta_0$
in terms of risk:


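(3.32)  $\delta_2(X) = \frac{1}{\alpha K^p} \int_{\|\tau\| \le K} [\tau + \delta_1(X - \tau)]\, d\tau$, $K \ge K_0$.

If we note that $\int_{\|\tau\| \le K} \tau\, d\tau = 0$ (by using polar
coordinates in $\mathbb{R}^p$), we obtain:

(3.33)  $\delta_2(X) = \frac{1}{\alpha K^p} \int_{\|\tau\| \le K} \delta_1(X - \tau)\, d\tau$, $K \ge K_0$

where $\delta_1$ is an estimator satisfying the conditions of Lemma 3.1.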

Formula (3.33) shows that the estimator $\delta_2$ is obtained as
the convolution of an estimator $\delta_1$, which improves outside of a
compact set, with a suitable uniform density in $\mathbb{R}^p$.
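
The convolution with a uniform density can be approximated by simulation. The sketch below is one possible Monte Carlo approximation, not a method given in the text: it draws $\tau$ uniformly from the ball of radius $K$ (a Gaussian direction combined with a radius $K U^{1/p}$, a standard sampling device) and averages $\delta_1(X - \tau)$. The function names, the number of draws and the seed are hypothetical.

```python
import math
import random


def sample_ball(p, K, rng):
    """Draw one point uniformly from the ball {tau : ||tau|| <= K} in R^p."""
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(v * v for v in g))
    radius = K * rng.random() ** (1.0 / p)
    return [radius * v / norm for v in g]


def smoothed_estimate(delta1, x, K, n_draws=5000, seed=0):
    """Monte Carlo approximation of the average of delta1(x - tau)
    over tau uniform in the ball of radius K."""
    rng = random.Random(seed)
    p = len(x)
    total = [0.0] * p
    for _ in range(n_draws):
        tau = sample_ball(p, K, rng)
        d = delta1([xi - ti for xi, ti in zip(x, tau)])
        for i in range(p):
            total[i] += d[i]
    return [t / n_draws for t in total]


if __name__ == "__main__":
    # With delta1 the identity, the average of X - tau should be close to X,
    # because tau has mean zero.
    print(smoothed_estimate(lambda v: v, [1.0, 2.0, 3.0], K=2.0))
```

Passing the identity map as `delta1` gives a quick check of the sampler, since the uniform distribution on a centred ball has mean zero.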

The possibility of expressing estimators which improve upon
$\delta_0$ as a convolution will be taken as the basis of further
developments. An important step in finding wider classes of estimators
which are better than $\delta_0$ is to generalize Lemma 3.1. We do this
below and, basically, this generalization shows that a wide class
of densities can be taken instead of a uniform density, to generate
estimators which improve upon $\delta_0$.

We shall consider truncated distributions, whose densities
(with respect to the Lebesgue measure) are of the form:

(3.34)  $\xi(\|x\|^2) = c\, \eta(\|x\|^2)$ if $\|x\| \le K$;  $\xi(\|x\|^2) = 0$ if $\|x\| > K$

where $K > 0$ and

(3.35)  $\frac{1}{c} = \int_{\|x\| \le K} \eta(\|x\|^2)\, dx$.

The following theorem generalizes Lemma 3.1. Note that we
consider again the general loss function $L(\theta, \delta(x)) = L(\delta(x) - \theta)$.


Theorem 3.2. Suppose the following hypotheses are satisfied:

(i) $L$ is a convex function;

(ii) there exists an estimator $\delta_1$ whose risk $R(\theta, \delta_1)$
is bounded on compact sets;

(iii) $\liminf_{\|\theta\| \to \infty} \|\theta\|^2 [R(\theta, \delta_0) - R(\theta, \delta_1)] > 0$;

(iv) $\sup_{y \in \mathbb{R}^p} \int_{\|x - y\| \le r} \eta(\|x\|^2)\, dx < \infty$, for any $r > 0$;

(v) $\lim_{K \to \infty} (1/K^2) \int_{\|x\| \le K} \eta(\|x\|^2)\, dx = \infty$.

Then, the estimator defined by

(3.36)  $\delta_2(X) = c \int_{\|\tau\| \le K} \delta_1(X - \tau)\, \eta(\|\tau\|^2)\, d\tau$

is better than the best invariant estimator $\delta_0$, for $K \ge K_0$
(where $K_0$ is a sufficiently large constant).

Proof: The method of proof is similar to that of Lemma 3.1
and, by using the same notations, we get:

(3.37)  $R(\theta, \delta_0) - R(\theta, \delta_2) \ge E_\tau[\phi(\|\theta - \tau\|)]$.

We choose the random variable $\tau$ distributed with density $\xi$
as given by (3.34), so that:
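(3.38)  $R(\theta, \delta_0) - R(\theta, \delta_2) \ge c \left[ \int_{\|\tau\| \le K,\ \|\theta - \tau\| \le r} (R_0 - B)\, \eta(\|\tau\|^2)\, d\tau + \int_{\|\tau\| \le K,\ \|\theta - \tau\| > r} \frac{a}{\|\theta - \tau\|^2}\, \eta(\|\tau\|^2)\, d\tau \right]$.

Consider $\|\theta\| \le K + r$ since, otherwise, we are done. We
obtain:

(3.39)  $R(\theta, \delta_0) - R(\theta, \delta_2) \ge c \left\{ \left[ R_0 - B - \frac{a}{(2K + r)^2} \right] \int_{\|\tau - \theta\| \le r} \eta(\|\tau\|^2)\, d\tau + \frac{a}{c\,(2K + r)^2} \right\}$.

Finally, by using hypotheses (iv) and (v), we show that the
expression in braces on the right-hand side of (3.39) goes to
$\infty$ as $K \to \infty$, which concludes the proof.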

Observe that $\delta_2$ is obtained from $\delta_1$ by a randomization of
the origin. Since:

(3.40)  $\int_{\|\tau\| \le K} \tau\, \eta(\|\tau\|^2)\, d\tau = 0$

we get formula (3.36), providing an estimator $\delta_2$ with smaller
risk than $\delta_0$.

Again, the estimator $\delta_2$ is the convolution of the estimator
$\delta_1$ (which improves upon $\delta_0$ outside of a compact set) with a
truncated density of the type (3.34).

We give now some examples which, according to the choice of
the density $\xi$, describe various classes of estimators improving
upon $\delta_0$.
Example 1. If we take $\eta(\|x\|^2) = 1$ in the definition
of $\xi$, then (iv) is obviously satisfied, while (v) is satisfied
for $p \ge 3$. In this particular case, we obtain Lemma 3.1
with $q = 2$.

Estimators which improve upon $\delta_0$ are given by:

(3.41)  $\delta_2(X) = \frac{1}{\alpha K^p} \int_{\|\tau\| \le K} \delta_1(X - \tau)\, d\tau$, $K \ge K_0$

and we recognize again formula (3.33) (i.e., convolution with a
uniform density).

Example 2. Consider $\eta(\|x\|^2) = 1/\|x\|^2$. By using polar
coordinates in $\mathbb{R}^p$, we observe that condition (v) is satisfied
for $p \ge 5$ (see also remark (1) below).

We can also prove that condition (iv) is satisfied:

(3.42)  $\sup_{y \in \mathbb{R}^p} \int_{\|x\| \le r} \frac{1}{\|x + y\|^2}\, dx < \infty$, for all $r > 0$.

To see this, perform a change of variables, by applying the
$p \times p$ orthogonal transformation $T$ such that
$T(y) = (\|y\|, 0, 0, \ldots, 0)$. We get

(3.43)  $\int_{\|x\| \le r} \frac{1}{\|x + y\|^2}\, dx = \int_{\|x\| \le r} \frac{1}{(x_1 + \|y\|)^2 + x_2^2 + \cdots + x_p^2}\, dx$.

Then, transform $(x_1, x_2, \ldots, x_p)$ into spherical coordinates:
$\eta(\|x\|^2) = 1/\|x\|^s$ with $s > 0$ and we consider the
corresponding truncated density $\xi$ given by (3.34).

Without repeating the calculation, which is similar to that
in Example 2, we mention that classes of estimators better than
$\delta_0$ are obtained in higher dimensions. More exactly, the critical
dimension for inadmissibility depends on $s$: the estimators

(3.48)  $\delta_2(X) = c \int_{\|\tau\| \le K} \frac{\delta_1(X - \tau)}{\|\tau\|^s}\, d\tau$, $K \ge K_0$

have smaller risk than $\delta_0(X) = X$, for $p \ge s + 3$. Here $c$ is
a constant of the order $1/K^{p-s}$.


Remarks. (1) Observe that condition (v) of Theorem 3.2 can
be stated in the equivalent form:

(3.49)  $\int_0^\infty t^{p-1}\, \eta(t^2)\, dt = \infty$ and $\lim_{t \to \infty} t^{p-2}\, \eta(t^2) = \infty$.

This can be proved by using polar coordinates.
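
For the power weights $\eta(\|x\|^2) = 1/\|x\|^s$ used in the examples, the quantity in condition (v) can be written in closed form by polar coordinates. The sketch below assumes that the normalizing power in (v) is $K^2$ (consistent with the statement that Example 1 recovers Lemma 3.1 with $q = 2$) and that $s < p$, so that the integral converges at the origin; the function names are hypothetical.

```python
import math


def sphere_area(p):
    """Surface area of the unit sphere in R^p: 2*pi^(p/2) / Gamma(p/2)."""
    return 2.0 * math.pi ** (p / 2.0) / math.gamma(p / 2.0)


def condition_v_ratio(K, p, s):
    """(1/K^2) * integral over ||x|| <= K of ||x||^(-s) dx, for s < p.

    In polar coordinates the integral equals sphere_area(p) * K^(p-s) / (p-s).
    """
    if s >= p:
        raise ValueError("the integral diverges at the origin unless s < p")
    return sphere_area(p) * K ** (p - s - 2) / (p - s)


if __name__ == "__main__":
    for p in (4, 5):
        print(p, [condition_v_ratio(K, p, 2) for K in (10.0, 100.0, 1000.0)])
```

With $s = 2$ the ratio stays constant in $K$ for $p = 4$ and grows without bound for $p = 5$, which matches the dimension requirement stated in Example 2.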

(2) In the proof of Theorem 3.1, the estimator $\delta_1$, which
improves upon $\delta_0$ outside of a compact set, is explicitly given.
By using this and Theorem 3.2, we can give a more explicit form
for the estimator $\delta_2$, which is uniformly better than $\delta_0$.

Consider $X$ an observation from the density $f(x - \theta)$, and
suppose that the hypotheses of Theorem 3.1 are satisfied. The loss
function is given by (3.1), where we assume for simplicity that
$c_i a_{ii} = Q$ = constant, for all $i = 1, 2, \ldots, p$.

Then the estimator $\delta_1$ becomes:

(3.50)  $\delta_1(X) = \left( 1 - \frac{\epsilon}{Q \|X\|^2} \right) X$.

If we consider the "convoluting" distribution to be uniform
in the ball with radius $K$, it can be easily shown that
(3.36) becomes:

(3.51)  $\delta_2(X) = X - \frac{\epsilon}{Q\, \alpha K^p} \int_{\|y - X\| \le K} \frac{y}{\|y\|^2}\, dy$.


This estimator is better than $\delta_0$ if $K \ge K_0$. The constant
$K_0$ depends (as can be seen from the proof of Lemma 3.1) on
various other constants, such as $R_0$, $B$, $a$, $p$. We can take

(3.52)  $a = \epsilon \left[ 2p - 4 - \frac{\epsilon}{Q \min_i a_{ii}} \right]$

with $0 < \epsilon < 2(p - 2) \cdot Q \cdot \min_i a_{ii}$, and we take $K$ such that:

(3.53)  $K^p - r^p + r^p (R_0 - B)(2K + r)^2 / a > 0$.

In specific examples, the constants $R_0$, $B$ can be calculated,
or replaced by some appropriate bounds.

Formulae analogous to (3.51) can be obtained by using other
truncated distributions, such as those given in the examples above.


3.3.  Further  developments 


It is possible to extend the results of previous sections and
to describe other classes of estimators which improve upon
$\delta_0(X) = X$.

Consider $X$ having a density of the location type $f(x - \theta)$
and consider the loss function (3.1) where, for simplicity,
we assume that $c_i a_{ii} = c_i \mathrm{Var}_0(X_i) = Q$ (a constant), for all
$i = 1, 2, \ldots, p$.

We assume that the hypotheses of Theorem 3.1 are satisfied.
Without loss of generality, we can take $c_i a_{ii} = 1$, $i = 1, 2, \ldots, p$.

Theorem 3.3. If we denote by

(3.54)  $\delta_1(X) = \left( 1 - \frac{\epsilon}{\|X\|^s} \right) X$

with $s > 2$ and $\epsilon > 0$, then

(3.55)  $\delta_2(X) = \frac{1}{\alpha K^p} \int_{\|y\| \le K} \delta_1(X - y)\, dy$, $K \ge K_0$

is a better estimator than $\delta_0(X) = X$, in dimensions $p > s$.


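Proof: With the notations of Section 3.1 we have:

(3.56)  $D(\theta) = 2\sum_i h_{ii}(\theta) - \sum_i c_i h_i^2(\theta)$.

Here $h_i(\theta) = \epsilon\, \theta_i / \|\theta\|^s$, and a simple calculation gives:

(3.57)  $h_{ii}(\theta) = \frac{\epsilon}{\|\theta\|^{2s}} \left[ \|\theta\|^s - s\, \theta_i^2 \|\theta\|^{s-2} \right]$.

Therefore, we obtain:

(3.58)  $D(\theta) = 2\epsilon \sum_i \frac{\|\theta\|^s - s\, \theta_i^2 \|\theta\|^{s-2}}{\|\theta\|^{2s}} - \sum_i \frac{c_i\, \epsilon^2 \theta_i^2}{\|\theta\|^{2s}} = \frac{2\epsilon (p - s)}{\|\theta\|^s} - \frac{\epsilon^2}{\|\theta\|^{2s}} \sum_i c_i \theta_i^2$.

Since $c_i \le A$ for all $i = 1, 2, \ldots, p$, where $A$ is a
constant, we get:

(3.59)  $D(\theta) \ge \frac{2\epsilon (p - s)}{\|\theta\|^s} - \frac{\epsilon^2 A}{\|\theta\|^{2s-2}}$.

Therefore:

(3.60)  $\|\theta\|^s D(\theta) \ge 2\epsilon (p - s) - \frac{\epsilon^2 A}{\|\theta\|^{s-2}}$.

Since $s > 2$, it follows that:

(3.61)  $\liminf_{\|\theta\| \to \infty} \|\theta\|^s D(\theta) \ge 2\epsilon (p - s) > 0$

for $p > s$ and $\epsilon > 0$. Applying Lemma 3.1 with $q = s$
ends the proof.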
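
The limit in (3.61) can be checked numerically from the closed form (3.58). The sketch below evaluates $\|\theta\|^s D(\theta)$ at a point far from the origin and compares it with $2\epsilon(p - s)$ and with the lower bound (3.60). The function name and the numerical values of $p$, $s$, $\epsilon$, $c_i$ and $\theta$ are assumptions for illustration only.

```python
import math


def scaled_d(theta, eps, s, c):
    """||theta||^s * D(theta), with D(theta) taken from the closed form (3.58):
    D = 2*eps*(p - s)/||theta||^s - eps^2 * sum(c_i*theta_i^2)/||theta||^(2s)."""
    p = len(theta)
    norm = math.sqrt(sum(t * t for t in theta))
    weighted = sum(ci * t * t for ci, t in zip(c, theta))
    d = 2.0 * eps * (p - s) / norm ** s - eps ** 2 * weighted / norm ** (2 * s)
    return norm ** s * d


if __name__ == "__main__":
    c = [1.0, 2.0, 0.5, 1.0]       # illustrative weights; A = max(c) = 2
    theta = [1e4] * 4              # a point far from the origin, p = 4
    eps, s = 1.0, 2.5
    value = scaled_d(theta, eps, s, c)
    limit = 2.0 * eps * (len(theta) - s)
    print(value, limit)
```

For these values the scaled quantity is already close to the limiting value $2\epsilon(p - s) = 3$ and lies above the bound in (3.60).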

Note. Observe that in the case $s > 2$ we need $\epsilon > 0$, but
$\epsilon$ is otherwise unrestricted.

If $s = 2$, by looking at the proof above, we see that we need
$0 < \epsilon < 2(p - 2)/A$.

If $2 < s < 3$ we get better estimators in dimensions $p \ge 3$;
otherwise the improvement upon $\delta_0(X) = X$ is obtained in
higher dimensions.

From the results of Sections 3.2 and 3.3 it is clear that
estimators $\delta_2$ which improve upon the best invariant estimator
$\delta_0$ are obtained as the convolution of some estimator $\delta_1$, which
improves upon $\delta_0$ outside of a compact set, with a suitable
probability density $\xi$ in $\mathbb{R}^p$:

(3.62)  $\delta_2 = \delta_1 * \xi$.

An interesting problem would be to study and, if possible,
to characterize the following class of densities in $\mathbb{R}^p$:

(3.63)  $V = \{\xi : R(\theta, \delta_1 * \xi) < R(\theta, \delta_0),\ \forall\, \theta \in \mathbb{R}^p\}$.

Note that $V$ contains suitable normal densities (see Brown
(1975)), as well as spherically uniform densities, and truncated
densities with properties (iv) and (v) in Theorem 3.2.

A characterization of $V$ would be useful in order to find
wider classes of estimators which improve upon the best invariant
procedure $\delta_0(X) = X$.

An interesting problem, closely related to this, is whether
any estimator which improves upon $\delta_0$ can be written as a
convolution of some estimator which improves outside of a compact set,
with a suitable p-dimensional probability density.

The answer to these problems, which is not known even when
sampling from the multivariate normal distribution, would possibly
give a better understanding of the structure of estimators which
improve upon the classical, best invariant procedure.